Nucleic acid molecules specific for bacterial antigens and uses thereof

ABSTRACT

The present invention relates to nucleic acid molecules derived from: a gene encoding a transferase; or a gene encoding an enzyme for the transport or processing of a polysaccharide or oligosaccharide unit, including a wzx gene or a wzy gene, or a gene with a similar function; the gene being involved in the synthesis of a particular bacterial polysaccharide antigen, wherein the sequence of the nucleic acid molecule is specific to the particular bacterial polysaccharide antigen. Polysaccharides to which the invention relates include O antigens. The invention also relates to methods of testing samples for the presence of one or more bacterial polysaccharide antigens, using the nucleic acid molecules of the invention, and to kits containing the nucleic acid molecules of the invention.

TECHNICAL FIELD

[0001] The invention relates to novel nucleotide sequences located in a gene cluster which controls the synthesis of a bacterial polysaccharide antigen, especially an O antigen, and the use of those nucleotide sequences for the detection of bacteria which express particular polysaccharide antigens (particularly O antigens) and for the identification of the polysaccharide antigens (particularly O antigens) of those bacteria.

BACKGROUND ART

[0002] Enteropathogenic E. coli strains are well known causes of diarrhoea and haemorrhagic colitis in humans and can lead to potentially life threatening sequelae including haemolytic uremic syndrome and thrombotic thrombocytopaenic purpura. Some of these strains are commonly found in livestock and infection in humans is usually a consequence of consumption of contaminated meat or dairy products which have been improperly processed. The O specific polysaccharide component (the “O antigen”) of lipopolysaccharide is known to be a major virulence factor of enteropathogenic E. coli strains.

[0003] The E. coli O antigen is highly polymorphic and 166 different forms of the antigen have been defined; Ewing, W. H. [in Edwards and Ewing's “Identification of the Enterobacteriaceae” Elsevier, Amsterdam (1986)] discusses 128 different O antigens while Lior H. (1994) extends the number to 166 [“Classification of Escherichia coli” in “Escherichia coli in domestic animals and humans” pp 31-72. Edited by C. L. Gyles, CAB International]. The species Salmonella enterica has 46 known O antigen types [Popoff M. Y. et al (1992) “Antigenic formulas of the Salmonella enterica serovars” 6th revision WHO Collaborating Centre for Reference and Research on Salmonella enterica, Institut Pasteur Paris France].

[0004] An important step in determining the biosynthesis of O antigens, and therefore the mechanism of the polymorphism, has been to characterise the gene clusters controlling O antigen biosynthesis. The genes specific for the synthesis of the O antigen are generally located in a gene cluster at map position 45 minutes on the chromosome of E. coli K-12 [Bachmann, B. J. 1990 “Linkage map of Escherichia coli K-12”. Microbiol. Rev. 54:130-197], and at the corresponding position in S. enterica LT2 [Sanderson et al (1995) “Genetic map of Salmonella enterica typhimurium”, Edition VIII Microbiol. Rev. 59: 241-303]. In both cases the O antigen gene cluster is close to the gnd gene, as is the case in other strains of E. coli and S. enterica [Reeves P. R. (1994) “Biosynthesis and assembly of lipopolysaccharide”, 281-314. in A. Neuberger and L. L. M. van Deenen (eds) “Bacterial cell wall, new comprehensive biochemistry” vol 27 Elsevier Science Publishers]. These genes encode enzymes for the synthesis of nucleotide diphosphate sugars and for assembly of the sugars into oligosaccharide units and in general for polymerisation to O antigen.

[0005] The E. coli O antigen gene clusters for a wide range of E. coli O antigens have been cloned but the O7, O9, O16 and O111 O antigens have been studied in more detail with only O9 and O16 having been fully characterised with regard to nucleotide sequence to date [Kido N., Torgov V. I., Sugiyama T., Uchiya K., Sugihara H., Komatsu T., Kato N. & Jann K. (1995) “Expression of the O9 polysaccharide of Escherichia coli: sequencing of the E. coli O9 rfb gene cluster, characterisation of mannosyl transferases, and evidence for an ATP-binding cassette transport system” J. of Bacteriol. 177 2178-2187; Stevenson G., Neal B., Liu D., Hobbs M., Packer N. H., Batley M., Redmond J. W., Lindquist L. & Reeves P. R. (1994) “Structure of the O antigen of E. coli K12 and the sequence of its rfb gene cluster” J. of Bacteriol. 176 4144-4156; Jayaratne, P. et al. (1991) “Cloning and analysis of duplicated rfbM and rfbK genes involved in the formation of GDP-mannose in Escherichia coli O9:K30 and participation of rfb genes in the synthesis of the group 1 K30 capsular polysaccharide” J. Bacteriol. 176: 3126-3139; Valvano, M. A. and Crosa, J. H. (1989) “Molecular cloning and expression in Escherichia coli K-12 of chromosomal genes determining the O7 lipopolysaccharide antigen of a human invasive strain of E. coli O7:K1”. Inf. and Immun. 57:937-943; Marolda C. L. and Valvano, M. A. (1993). “Identification, expression, and DNA sequence of the GDP-mannose biosynthesis genes encoded by the O7 rfb gene cluster of strain VW187 (Escherichia coli O7:K1)”. J. Bacteriol. 175:148-158].

[0006] Bastin D. A., et al. 1991 [“Molecular cloning and expression in Escherichia coli K-12 of the rfb gene cluster determining the O antigen of an E. coli O111 strain”. Mol. Microbiol. 5:9 2223-2231] and Bastin D. A. and Reeves, P. R. [(1995) “Sequence and analysis of the O antigen gene (rfb) cluster of Escherichia coli O111”. Gene 164: 17-23] isolated chromosomal DNA encoding the E. coli O111 rfb region and characterised a 6962 bp fragment of E. coli O111 rfb. Six open reading frames (orfs) were identified in the 6962 bp partial fragment and the alignment of the sequences of these orfs revealed homology with genes of the GDP-mannose pathway, rfbK and rfbM, and other rfb and cps genes.

[0007] The nucleotide sequences of the loci which control expression of Salmonella enterica B, A, D1, D2, D3, C1, C2 and E O antigens have been characterised (Brown, P. K., L. K. Romana and P. R. Reeves (1991) “Cloning of the rfb gene cluster of a group C2 Salmonella enterica: comparison with the rfb regions of groups B and D” Mol. Microbiol. 5:1873-1881; Jiang, X.-M., B. Neal, F. Santiago, S. J. Lee, L. K. Romana, and P. R. Reeves (1991) “Structure and sequence of the rfb (O antigen) gene cluster of Salmonella enterica serovar typhimurium (LT2)”. Mol. Microbiol. 5:692-713; Lee, S. J., L. K. Romana, and P. R. Reeves (1992) “Sequences and structural analysis of the rfb (O antigen) gene cluster from a group C1 Salmonella enterica enterica strain” J. Gen. Microbiol. 138: 1843-1855; Liu, D., N. K. Verma, L. K. Romana, and P. R. Reeves (1991) “Relationship among the rfb regions of Salmonella enterica serovars A, B and D” J. Bacteriol. 173: 4814-4819; Verma, N. K., and P. Reeves (1989) “Identification and sequence of rfbS and rfbE, which determine the antigenic specificity of group A and group D Salmonella entericae” J. Bacteriol. 171: 5694-5701; Wang, L., L. K. Romana, and P. R. Reeves (1992) “Molecular analysis of a Salmonella enterica enterica group E1 rfb gene cluster: O antigen and the genetic basis of the major polymorphism” Genetics 130: 429-443; Wyk, P., and P. Reeves (1989). “Identification and sequence of the gene for abequose synthase, which confers antigenic specificity on group B Salmonella entericae: homology with galactose epimerase” J. Bacteriol. 171: 5687-5693; Xiang, S. H., M. Hobbs, and P. R. Reeves (1994) “Molecular analysis of the rfb gene cluster of a group D2 Salmonella enterica strain: evidence for its origin from an insertion sequence mediated recombination event between group E and D1 strains” J. Bacteriol. 176: 4357-4365; Curd, H., D. Liu and P. R. Reeves (1998) “Relationships among the O antigen Salmonella enterica groups B, D1, D2, and D3” J. Bacteriol. 180: 1002-1007).

[0008] Of the closely related Shigella (which really can be considered to be part of E. coli), the S. dysenteriae and S. flexneri O antigen gene clusters have been fully sequenced and are next to gnd [Klena JD & Schnaitman CA (1993) “Function of the rfb gene cluster and the rfe gene in the synthesis of O antigen by Shigella dysenteriae 1” Mol. Microbiol. 9 393-402; Morona R., Mavris M., Fallarino A. & Manning P. (1994) “Characterisation of the rfc region of Shigella flexneri” J. Bacteriol. 176: 733-747].

[0009] Inasmuch as the O antigen of enteropathogenic E. coli strains and the O antigen of Salmonella enterica strains are major virulence factors and are highly polymorphic, there is a real need to develop highly specific, sensitive, rapid and inexpensive diagnostic assays to detect E. coli and assays to detect S. enterica. There is also a real need to develop diagnostic assays to identify the O antigens of E. coli strains and assays to identify the O antigens of S. enterica strains. With regard to the detection of E. coli these needs extend beyond EHEC (enteropathogenic haemorrhagic E. coli) strains, but this is the area of greatest need. There is also interest in diagnostics for other E. coli groups such as ETEC (enterotoxigenic E. coli).

[0010] The first diagnostic systems employed in this field used large panels of antisera raised against E. coli O antigen expressing strains or S. enterica O antigen expressing strains. This technology has inherent difficulties associated with the preparation, storage and usage of the reagents, as well as the time required to achieve a meaningful diagnostic result.

[0011] Nucleotide sequences derived from the O antigen gene clusters of S. enterica strains have been used to determine S. enterica O antigens in a PCR assay [Luk, J. M. C. et al. (1993) “Selective amplification of abequose and paratose synthase genes (rfb) by polymerase chain reaction for identification of S. enterica major serogroups (A, B, C2, and D)”, J. Clin. Microbiol. 31:2118-2123]. The prior complete nucleotide sequence characterisation of the entire rfb locus of serovars Typhimurium, Paratyphi A, Typhi, Muenchen, and Anatum, representing groups B, A, D1, C2 and E1 respectively, enabled Luk et al. to select oligonucleotide primers specific for those serogroups. Thus the approach of Luk et al. was based on aligning known nucleotide sequences corresponding to CDP-abequose and CDP-paratose synthesis genes within the O antigen regions of S. enterica serogroups E1, D1, A, B and C2 and exploiting the observed nucleotide sequence differences in order to identify serotype-specific oligonucleotides.

[0012] In an attempt to determine the O antigen serotype of a Shiga-like toxin producing E. coli strain, Paton, A. W., et al. 1996 [“Molecular microbiological investigation of an outbreak of Hemolytic-Uremic Syndrome caused by dry fermented sausage contaminated with Shiga-like toxin producing Escherichia coli”. J. Clin. Microbiol. 34: 1622-1627], used oligonucleotides derived from the wbdl (orf6) region, which were believed to be specific to the E. coli O111 antigen and which were derived from E. coli O111 sequence, in a PCR diagnostic assay. Unpublished reports indicate that the approach of Paton et al. is deficient in that the nucleotide sequences derived from wbdl may not specifically identify the O111 antigen and in fact lead to false positive results. Paton et al. disclose the detection of 5 O111 antigen isolates by PCR when in fact from only 3 of those isolates did they detect bacteria which reacted with O111 specific antiserum.

DESCRIPTION OF THE INVENTION

[0013] Whilst not wanting to be held to a particular hypothesis, the present inventors now believe that the reported false positives found with the Paton et al. method are due to the fact that the nucleic acid molecules employed by Paton et al. were derived from genes which have a putative function as a sugar pathway gene [Bastin D. A. and Reeves, P. R. (1995) “Sequence and analysis of the O antigen gene (rfb) cluster of Escherichia coli O111”. Gene 164: 17-23], which they now believe to lack the necessary nucleotide sequence specificity to identify the E. coli O antigen. The inventors now believe that many of the nucleic acid molecules derived from sugar pathway genes expressed in S. enterica or other enterobacteria are also likely to lack the necessary nucleotide sequence specificity to identify specific O antigens or specific serotypes.

[0014] In this regard it is important to note that the genes for the synthesis of a polysaccharide antigen include those related to the synthesis of the sugars present in the antigen (sugar pathway genes) and those related to the manipulation of those sugars to form the polysaccharide. The present invention is predominantly concerned with the latter group of genes, particularly the assembly and transport genes such as transferase, polymerase and flippase genes.

[0015] The present inventors have surprisingly found that the use of nucleic acid molecules derived from particular assembly and transport genes, particularly transferase, wzx and wzy genes, within O antigen gene clusters can improve the specificity of the detection and identification of O antigens. The present inventors believe that the invention is not necessarily limited to the detection of the particular O antigens which are encoded by the nucleic acid molecules exemplified herein, but has broad application for the detection of bacteria which express an O antigen and the identification of O antigens in general. Further because of the similarities between the gene clusters involved in the synthesis of O antigens and other polymorphic polysaccharide antigens, such as bacterial capsular antigens, the inventors believe that the methods and molecules of the present invention are also applicable to these other polysaccharide antigens.

[0016] Accordingly, in one aspect the present invention relates to the identification of nucleic acid molecules which are useful for the detection and identification of specific bacterial polysaccharide antigens.

[0017] The invention provides a nucleic acid molecule derived from: a gene encoding a transferase; or a gene encoding an enzyme for the transport or processing of a polysaccharide or oligosaccharide unit, including a wzx gene, wzy gene, or a gene with a similar function; the gene being involved in the synthesis of a particular bacterial polysaccharide antigen, wherein the sequence of the nucleic acid molecule is specific to the particular bacterial polysaccharide antigen.

[0018] Polysaccharide antigens, such as capsular antigens of E. coli (Type I and Type II), the Virulence capsule of S. enterica sv Typhi and the capsules of species such as Streptococcus pneumoniae and Staphylococcus albus are encoded by genes which include nucleotide sugar pathway genes, sugar transferase genes and genes for the transport and processing of the polysaccharide or oligosaccharide unit. In some cases these are wzx or wzy but in other cases they are quite different because a different processing pathway is used. Examples of other gene clusters include the gene clusters for an extracellular polysaccharide of Streptococcus thermophilus, an exopolysaccharide of Rhizobium meliloti and the K2 capsule of Klebsiella pneumoniae. These all have genes which by experimental analysis, comparison of nucleotide sequence or predicted protein structure, can be seen to include nucleotide sugar pathway genes, sugar transferase genes and genes for oligosaccharide or polysaccharide processing.

[0019] In the case of the E. coli K-12 colanic acid capsule gene cluster [Stevenson et al (1996) “Organization of the Escherichia coli K-12 gene cluster responsible for production of the extracellular polysaccharide colanic acid”. J. Bacteriol 178: 4885-4893] genes from the three classes were identified either provisionally or definitively. Colanic acid capsule is classified with the Type I capsule of E. coli.

[0020] The present inventors believe that, in general, transferase genes and genes for oligosaccharide processing will be more specific for a given capsule than the genes coding for the nucleotide sugar synthetic pathways as most sugars present in such capsules occur in the capsules of different serotypes. Thus the nucleotide sugar synthesis pathway genes could now be predicted to be common to more than one capsule type.

[0021] As elaborated below the present inventors recognise that there may be polysaccharide antigen gene clusters which share transferase genes and/or genes for oligosaccharide or polysaccharide processing so that completely random selection of nucleotide sequences from within these genes may still lead to cross-reaction; an example with respect to capsular antigens is provided by the E. coli type II capsules for which only transferase genes are sufficiently specific. However, the present inventors in light of their current results nonetheless consider the transferase genes or genes controlling oligosaccharide or polysaccharide processing to be superior targets for nucleotide sequence selection for the specific detection and characterisation of polysaccharide antigen types. Thus where there is similarity between particular genes, selection of nucleotide sequences from within other transferase genes or genes for oligosaccharide or polysaccharide processing from within the relevant gene cluster will still provide specificity, or alternatively the use of combinations of nucleotide sequences will provide the desired specificity. The combinations of nucleotide sequences may include nucleotide sequences derived from pathway genes together with nucleotide sequences derived from transferase, wzx or wzy genes.
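The selection principle described in this paragraph (avoid sequences that are shared with other gene clusters) can be illustrated with a minimal computational sketch. It is not part of the disclosed method. The function names `kmers` and `specific_kmers`, the default fragment length of 12, and the restriction to exact matches on a single strand are all assumptions made for illustration; a practical screen would also consider the complementary strand and near matches.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def specific_kmers(target_gene, other_sequences, k=12):
    """List, in order of first occurrence, the k-mers of target_gene
    that do not occur (exact match, same strand) in any other sequence.

    Hypothetical helper: 'other_sequences' stands for genes from other
    polysaccharide antigen gene clusters that might cross-react.
    """
    shared = set()
    for other in other_sequences:
        shared |= kmers(other, k)

    candidates = []
    seen = set()
    for i in range(len(target_gene) - k + 1):
        kmer = target_gene[i:i + k]
        if kmer not in shared and kmer not in seen:
            seen.add(kmer)
            candidates.append(kmer)
    return candidates
```

An empty result for one gene would suggest turning to another transferase or processing gene in the same cluster, or to a combination of sequences, as the paragraph describes.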

[0022] Thus the invention also provides a panel of nucleic acid molecules wherein the nucleic acid molecules are derived from a combination of genes encoding transferases and/or enzymes for the transport or processing of a polysaccharide or oligosaccharide unit including wzx or wzy genes; wherein the combination of genes is specific to the synthesis of a particular bacterial polysaccharide antigen and wherein the panel of nucleic acid molecules is specific to a bacterial polysaccharide antigen. In another preferred form, the nucleic acid molecules are derived from a combination of genes encoding transferases and/or enzymes for the transport or processing of a polysaccharide or oligosaccharide unit including wzx or wzy genes, together with nucleic acid molecules derived from pathway genes.

[0023] In a second aspect the present invention relates to the identification of nucleic acid molecules which are useful for the detection of bacteria which express O antigens and for the identification of the O antigens of those bacteria in diagnostic assays.

[0024] The invention provides a nucleic acid molecule derived from: a gene encoding a transferase; or a gene encoding an enzyme for the transport or processing of a polysaccharide or oligosaccharide unit such as a wzx or wzy gene, the gene being involved in the synthesis of a particular bacterial O antigen, wherein the sequence of the nucleic acid molecule is specific to the particular bacterial O antigen.

[0025] The nucleic acids of the invention may be variable in length. In one embodiment they are from about 10 to about 20 nucleotides in length.

[0026] In one preferred embodiment, the invention provides a nucleic acid molecule derived from: a gene encoding a transferase; or a gene encoding an enzyme for the transport or processing of a polysaccharide or oligosaccharide unit including a wzx or wzy gene; the gene being involved in the synthesis of an O antigen expressed by E. coli, wherein the sequence of the nucleic acid molecule is specific to the O antigen.

[0027] In one more preferred embodiment, the sequence of the nucleic acid molecule is specific to the nucleotide sequence encoding the O111 antigen (SEQ ID NO:1). More preferably, the sequence is derived from a gene selected from the group consisting of wbdH (nucleotide position 739 to 1932 of SEQ ID NO:1), wzx (nucleotide position 8646 to 9911 of SEQ ID NO:1), wzy (nucleotide position 9901 to 10953 of SEQ ID NO:1), wbdM (nucleotide position 11821 to 12945 of SEQ ID NO:1) and fragments of those molecules of at least 10-12 nucleotides in length. Particularly preferred nucleic acid molecules are those set out in Table 5 and 5A, with respect to the above mentioned genes.

[0028] In another more preferred embodiment, the sequence of the nucleic acid molecule is specific to the nucleotide sequence encoding the O157 antigen (SEQ ID NO:2). More preferably the sequence is derived from a gene selected from the group consisting of wbdN (nucleotide position 79 to 861 of SEQ ID NO:2), wbdO (nucleotide position 2011 to 2757 of SEQ ID NO:2), wbdP (nucleotide position 5257 to 6471 of SEQ ID NO:2), wbdR (nucleotide position 13156 to 13821 of SEQ ID NO:2), wzx (nucleotide position 2744 to 4135 of SEQ ID NO:2) and wzy (nucleotide position 858 to 2042 of SEQ ID NO:2). Particularly preferred nucleic acid molecules are those set out in Table 6 and 6A.
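By way of illustration only, the nucleotide positions recited in this paragraph can be used to pull a gene, and candidate fragments of it, out of a cluster sequence. The sketch below assumes that the positions are 1-based and inclusive, as is usual for sequence listings. The names `O157_GENES`, `gene_length`, `extract_gene` and `fragments` are hypothetical, and SEQ ID NO:2 itself is not reproduced here, so any sequence passed in is supplied by the user.

```python
# Positions as recited in paragraph [0028] for SEQ ID NO:2
# (assumed here to be 1-based and inclusive).
O157_GENES = {
    "wbdN": (79, 861),
    "wzy": (858, 2042),
    "wbdO": (2011, 2757),
    "wzx": (2744, 4135),
    "wbdP": (5257, 6471),
    "wbdR": (13156, 13821),
}


def gene_length(start, end):
    """Length in nucleotides of a 1-based, inclusive range."""
    return end - start + 1


def extract_gene(cluster_seq, start, end):
    """Return the subsequence at 1-based, inclusive positions start..end."""
    if start < 1 or end > len(cluster_seq) or start > end:
        raise ValueError("positions fall outside the supplied sequence")
    return cluster_seq[start - 1:end]


def fragments(gene_seq, size):
    """Return every contiguous fragment of the given size, in order."""
    return [gene_seq[i:i + size] for i in range(len(gene_seq) - size + 1)]
```

Under that coordinate assumption each of the six recited ranges has a length that is a multiple of three, which is consistent with their being open reading frames.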

[0029] The invention also provides in a further preferred embodiment a nucleic acid molecule derived from: a gene encoding a transferase; or a gene encoding an enzyme for the transport or processing of a polysaccharide or oligosaccharide unit including a wzx or wzy gene; the gene being involved in the synthesis of an O antigen expressed by Salmonella enterica, wherein the sequence of the nucleic acid molecule is specific to the O antigen.

[0030] In one more preferred form of this embodiment, the sequence of the nucleic acid molecule is specific to the nucleotide sequence encoding the S. enterica C2 antigen (SEQ ID NO:3). More preferably the sequence of the nucleic acid molecule is derived from a gene selected from the group consisting of wbaR (nucleotide position 2352 to 3314 of SEQ ID NO:3), wbaL (nucleotide position 3361 to 3875 of SEQ ID NO:3), wbaQ (nucleotide position 3977 to 5020 of SEQ ID NO:3), wbaW (nucleotide position 6313 to 7323 of SEQ ID NO:3), wbaZ (nucleotide position 7310 to 8467 of SEQ ID NO:3), wzx (nucleotide position 1019 to 2359 of SEQ ID NO:3) and wzy (nucleotide position 5114 to 6313 of SEQ ID NO:3). Particularly preferred nucleic acid molecules are those set out in Table 7.

[0031] In another more preferred form of this embodiment, the sequence of the nucleic acid molecule is specific to the nucleotide sequence encoding the S. enterica B antigen (SEQ ID NO:4). More preferably the sequence is derived from wzx (nucleotide position 12762 to 14054 of SEQ ID NO:4) or wbaV (nucleotide position 14059 to 15060 of SEQ ID NO:4). Particularly preferred nucleic acid molecules are those set out in Table 8 which are derived from wzx and wbaV genes.

[0032] In a further more preferred form of this embodiment, the sequence of the nucleic acid molecule is specific to the S. enterica D3 O antigen and is derived from the wzy gene.

[0033] In yet a further preferred form of this embodiment, the sequence of the nucleic acid molecule is specific to the S. enterica E1 O antigen and is derived from the wzx gene.

[0034] While transferase genes, or genes coding for the transport or processing of a polysaccharide or oligosaccharide unit, such as a wzx or wzy gene, are superior targets for specific detection of individual O antigen types, there may well be individual genes or parts of them within this group that can be demonstrated to be the same or closely related between different O antigen types such that cross-reactions can occur. Cross-reactions should be avoided by the selection of a different target within the group or the use of multiple targets within the group.

[0035] Further, it is recognised that there are cases where O antigen gene clusters have arisen from recombination of at least two strains such that the unique O antigen type is provided by a combination of gene products shared with at least two other O antigen types. The recognised example of this phenomenon is the S. enterica O antigen serotype D2 which has genes from D1 and E1 but none unique to D2. In these circumstances the detection of the O antigen type can still be achieved in accordance with the invention, but requires the use of a combination of nucleic acid molecules to detect a specific combination of genes that exists only in that particular O antigen gene cluster.

[0036] Thus, the invention also provides a panel of nucleic acid molecules wherein the nucleic acid molecules are derived from genes encoding transferases and/or enzymes for the transport or processing of a polysaccharide or oligosaccharide unit including wzx or wzy genes, wherein the panel of nucleic acid molecules is specific to a particular bacterial O antigen. Preferably the particular bacterial O antigen is expressed by S. enterica. More preferably, the panel of nucleic acid molecules is specific to the D2 O antigen and is derived from the E1 wzy gene and the D1 wzx gene.
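The logic of such a panel can be sketched as follows, purely for illustration: the D2 call is made only when both markers of the combination are detected, since each marker alone is shared with another O antigen group. The names `D2_PANEL` and `matches_panel` and the marker labels are hypothetical, and the set of detected markers is assumed to come from whatever hybridisation or amplification assay is in use.

```python
# Markers named in paragraph [0036] for the D2 panel (labels are illustrative).
D2_PANEL = ("E1 wzy", "D1 wzx")


def matches_panel(detected_markers, panel=D2_PANEL):
    """Return True only if every marker in the panel was detected.

    'detected_markers' is a hypothetical collection of marker labels
    reported as positive by the assay.
    """
    return all(marker in detected_markers for marker in panel)
```

For example, `matches_panel({"E1 wzy", "D1 wzx"})` is true, whereas a sample positive for the D1 wzx marker alone does not satisfy the panel.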

[0037] The combinations of nucleotide sequences may include nucleotide sequences derived from pathway genes, together with nucleotide sequences derived from transferase, wzx or wzy genes.

[0038] Thus, the invention also provides a panel of nucleic acid molecules, wherein the nucleic acid molecules are derived from genes encoding transferases and/or enzymes for the transport or processing of a polysaccharide or oligosaccharide unit including wzx or wzy genes, and sugar pathway genes, wherein the panel of nucleic acid molecules is specific to a particular bacterial O antigen. Preferably the O antigen is expressed by S. enterica.

[0039] Further it is recognised that there may be instances where spurious hybridisation will arise through initial selection of a sequence found in many different genes but this is typically recognisable by, for instance, comparison of band sizes against controls in PCR gels, and an alternative sequence can be selected.

[0040] The present inventors believe that based on the teachings of the present invention and available information concerning polysaccharide antigen gene clusters (including O antigen gene clusters), and through use of experimental analysis, comparison of nucleic acid sequences or predicted protein structures, nucleic acid molecules in accordance with the invention can be readily derived for any particular polysaccharide antigen of interest. Suitable bacterial strains can typically be acquired commercially from depositary institutions.

[0041] As mentioned above there are currently 166 defined E. coli O antigens while S. enterica has 46 known O antigen types [Popoff M. Y. et al (1992) “Antigenic formulas of the Salmonella serovars” 6th revision WHO Collaborating Centre for Reference and Research on Salmonella, Institut Pasteur Paris France]. Many other genera of bacteria are known to have O antigens and these include Citrobacter, Shigella, Yersinia, Plesiomonas, Vibrio and Proteus.

[0042] Samples of the 166 different E. coli O antigen serotypes are available from Statens Serum Institut, Copenhagen, Denmark.

[0043] The 46 S. enterica serotypes are available from Institute of Medical and Veterinary Science, Adelaide, Australia.

[0044] In another aspect, the invention relates to a method of testing a sample for the presence of one or more bacterial polysaccharide antigens comprising contacting the sample with at least one oligonucleotide molecule capable of specifically hybridising to: (i) a gene encoding a transferase, or (ii) a gene encoding an enzyme for transport or processing of oligosaccharide or polysaccharide units, including a wzx or wzy gene; wherein said gene is involved in the synthesis of the bacterial polysaccharide antigen; under conditions suitable to permit the at least one oligonucleotide molecule to specifically hybridise to at least one such gene of any bacteria expressing the particular bacterial polysaccharide antigen present in the sample and detecting any specifically hybridised oligonucleotide molecules.

[0045] Where a single specific oligonucleotide molecule is unavailable a combination of molecules hybridising specifically to the target region may be used. Thus the invention provides a panel of nucleic acid molecules for use in the method of testing of the invention, wherein the nucleic acid molecules are derived from genes encoding transferases and/or enzymes for the transport or processing of a polysaccharide or oligosaccharide unit including wzx or wzy genes, wherein the panel of nucleic acid molecules is specific to a particular bacterial polysaccharide. The panel of nucleic acid molecules can include nucleic acid molecules derived from sugar pathway genes where necessary.

[0046] In another aspect, the invention relates to a method of testing a sample for the presence of one or more bacterial polysaccharide antigens comprising contacting the sample with at least one pair of oligonucleotide molecules, with at least one oligonucleotide molecule of the pair capable of specifically hybridising to: (i) a gene encoding a transferase, or (ii) a gene encoding an enzyme for transport or processing of oligosaccharide or polysaccharide units, including a wzx or wzy gene; wherein said gene is involved in the synthesis of the bacterial polysaccharide antigen; under conditions suitable to permit the at least one oligonucleotide molecule of the pair of molecules to specifically hybridise to at least one such gene of any bacteria expressing the particular bacterial polysaccharide antigen present in the sample and detecting any specifically hybridised oligonucleotide molecules.

[0047] The pair of oligonucleotide molecules may both hybridise to the same gene or to different genes. Only one oligonucleotide molecule of the pair need hybridise specifically to sequence specific for the particular antigen type. The other molecule can hybridise to a non-specific region.
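The point that only one member of the pair need be antigen-specific can be illustrated with a simple in silico amplification check. This is a sketch under stated assumptions, not the assay of the invention: primers are required to match exactly, only the given template strand is searched for the forward primer, and the function names `reverse_complement` and `predicted_amplicon_size` are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def predicted_amplicon_size(template, forward, reverse):
    """Return the predicted product length, or None if no product.

    A product is predicted only when the forward primer matches the
    template exactly and the reverse primer's binding site (its reverse
    complement on this strand) lies downstream of it.
    """
    start = template.find(forward)
    if start == -1:
        return None
    site = reverse_complement(reverse)
    end = template.find(site, start + len(forward))
    if end == -1:
        return None
    return end + len(site) - start
```

If the forward primer is the specific one, a template lacking its binding site yields no predicted product even though the non-specific reverse primer could still anneal, so the pair as a whole remains specific.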

[0048] Where the particular polysaccharide antigen gene cluster has arisen through recombination, the at least one pair of oligonucleotide molecules may be selected to be capable of hybridising to a specific combination of genes in the cluster specific to that polysaccharide antigen, or multiple pairs may be selected to provide hybridisation to the specific combination of genes. Even where all the genes in a particular cluster are unique, the method may be carried out using nucleotide molecules which recognise a combination of genes within the cluster.

[0049] Thus the invention provides a panel containing pairs of nucleic acid molecules for use in the method of testing of the invention, wherein the pairs of nucleic acid molecules are derived from genes encoding transferases and/or enzymes for the transport or processing of a polysaccharide or oligosaccharide unit including wzx or wzy genes, wherein the panel of nucleic acid molecules is specific to a particular bacterial polysaccharide antigen. The panel of nucleic acid molecules can include pairs of nucleic acid molecules derived from sugar pathway genes where necessary.

[0050] In another aspect, the invention relates to a method of testing a sample for the presence of one or more particular bacterial O antigens comprising contacting the sample with at least one oligonucleotide molecule capable of specifically hybridising to: (i) a gene encoding an O antigen transferase, or (ii) a gene encoding an enzyme for transport or processing of the oligosaccharide or polysaccharide unit, including a wzx or wzy gene; wherein said gene is involved in the synthesis of the particular O antigen; under conditions suitable to permit the at least one oligonucleotide molecule to specifically hybridise to at least one such gene of any bacteria expressing the particular bacterial O antigen present in the sample and detecting any specifically hybridised oligonucleotide molecules. Preferably the bacteria are E. coli or S. enterica. More preferably, the E. coli express the O157 serotype or the O111 serotype. More preferably the S. enterica express the C2 or B serotype. Preferably, the method is a Southern blot method. More preferably, the nucleic acid molecule is labelled and hybridisation of the nucleic acid molecule is detected by autoradiography or detection of fluorescence.

[0051] The inventors envisage circumstances where a single specific oligonucleotide molecule is unavailable. In these circumstances a combination of molecules hybridising specifically to the target region may be used. Thus the invention provides a panel of nucleic acid molecules for use in the method of testing of the invention, wherein the nucleic acid molecules are derived from genes encoding transferases and/or enzymes for the transport or processing of a polysaccharide or oligosaccharide unit including wzx or wzy genes, wherein the panel of nucleic acid molecules is specific to a particular bacterial O antigen. Preferably the particular bacterial O antigen is expressed by S. enterica. The panel of nucleic acid molecules can include nucleic acid molecules derived from sugar pathway genes where necessary.

[0052] In another aspect, the invention relates to a method of testing a sample for the presence of one or more particular bacterial O antigens comprising contacting the sample with at least one pair of oligonucleotide molecules with at least one oligonucleotide molecule of the pair being capable of specifically hybridising to: (i) a gene encoding an O antigen transferase, or (ii) a gene encoding an enzyme for transport or processing of the oligosaccharide or polysaccharide unit, including a wzx or wzy gene; wherein said gene is involved in the synthesis of the particular O antigen; under conditions suitable to permit the at least one oligonucleotide molecule to specifically hybridise to at least one such gene of any bacteria expressing the particular bacterial O antigen present in the sample and detecting any specifically hybridised oligonucleotide molecules.

[0053] Preferably the bacteria are E. coli or S. enterica. More preferably, the E. coli are of the O111 or the O157 serotype. More preferably the S. enterica express the C2 or B serotype. Preferably, the method is a polymerase chain reaction method. More preferably the oligonucleotide molecules for use in the method of the invention are labelled. Even more preferably the hybridised oligonucleotide molecules are detected by electrophoresis. Preferred oligonucleotides for use with O111 which provide for specific detection of O111 are illustrated in Tables 5 and 5A with respect to the genes wbdH, wzx, wzy and wbdM. Preferred oligonucleotide molecules for use with O157 which provide for specific detection of O157 are illustrated in Tables 6 and 6A.

[0054] With respect to serotypes C2 and B, suitable oligonucleotide molecules can be selected from appropriate regions described in column 3 of Tables 7 and 8.
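The primer-pair logic of the polymerase chain reaction method described above can be illustrated in silico. The following is a minimal sketch only and is not the procedure of the invention: the function names, the toy template and the toy primers are hypothetical, and exact string matching stands in for specific hybridisation under suitable conditions.

```python
# Minimal in silico PCR sketch (hypothetical helper names; exact matching only).

COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def predicted_amplicon(template: str, forward: str, reverse: str):
    """Return the predicted product, or None if either primer fails to bind.

    `forward` must match the given strand; `reverse` must match the
    opposite strand, i.e. its reverse complement must occur in `template`
    downstream of the forward primer site.
    """
    template, forward, reverse = template.lower(), forward.lower(), reverse.lower()
    start = template.find(forward)
    if start == -1:
        return None
    rev_site = reverse_complement(reverse)
    end = template.find(rev_site, start + len(forward))
    if end == -1:
        return None
    return template[start:end + len(rev_site)]


# Toy data, invented for illustration; these are not sequences from the tables.
toy_template = "ttgacc" + "atggctagcta" + "ccccgggaaattt" + "gatcgatcgga" + "aacc"
toy_forward = "atggctagcta"
toy_reverse = reverse_complement("gatcgatcgga")

product = predicted_amplicon(toy_template, toy_forward, toy_reverse)
print(len(product) if product else "no product")
```

A product is predicted only when both primers find their sites in the expected orientation, which mirrors the requirement that a band of the expected size appears on electrophoresis only for bacteria carrying the targeted gene.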

[0055] The inventors envisage rare circumstances whereby two genetically similar gene clusters encoding serologically different O antigens have arisen through recombination of genes or mutation so as to generate polymorphic variants. In these circumstances multiple pairs of oligonucleotides may be selected to provide hybridisation to the specific combination of genes. The invention thus provides a panel containing pairs of nucleic acid molecules for use in the method of testing of the invention, wherein the pairs of nucleic acid molecules are derived from genes encoding transferases and/or enzymes for the transport or processing of a polysaccharide or oligosaccharide unit including wzx or wzy genes, wherein the panel of nucleic acid molecules is specific to a particular bacterial O antigen. Preferably the particular bacterial O antigen is expressed by S. enterica. The panel of nucleic acid molecules can include pairs of nucleic acid molecules derived from sugar pathway genes where necessary.

[0056] In another aspect, the invention relates to a method for testing a food derived sample for the presence of one or more particular bacterial O antigens comprising contacting the sample with at least one pair of oligonucleotide molecules with at least one oligonucleotide molecule of the pair being capable of specifically hybridising to: (i) a gene encoding an O antigen transferase, or (ii) a gene encoding an enzyme for transport or processing of the oligosaccharide or polysaccharide unit, including a wzx or wzy gene; wherein the gene is involved in the synthesis of the particular O antigen; under conditions suitable to permit the at least one oligonucleotide molecule to specifically hybridise to at least one such gene of any bacteria expressing the particular bacterial polysaccharide antigen present in the sample and detecting any specifically hybridised oligonucleotide molecules. Preferably the bacteria are E. coli or S. enterica. More preferably, the E. coli are of the O111 or O157 serotype. More preferably the S. enterica are of the C2 or B serotype. Preferably, the method is a polymerase chain reaction method. More preferably the oligonucleotide molecules for use in the method of the invention are labelled. Even more preferably the hybridised oligonucleotide molecules are detected by electrophoresis.

[0057] In another aspect the present invention relates to a method for testing a faecal derived sample for the presence of one or more particular bacterial O antigens comprising contacting the sample with at least one pair of oligonucleotide molecules with at least one oligonucleotide molecule of the pair being capable of specifically hybridising to: (i) a gene encoding an O antigen transferase, or (ii) a gene encoding an enzyme for transport or processing of the oligosaccharide or polysaccharide unit, including a wzx or wzy gene; wherein said gene is involved in the synthesis of the particular O antigen; under conditions suitable to permit the at least one oligonucleotide molecule to specifically hybridise to at least one such gene of any bacteria expressing the particular bacterial O antigen present in the sample and detecting any specifically hybridised oligonucleotide molecules. Preferably the bacteria are E. coli or S. enterica. More preferably, the E. coli are of the O111 or O157 serotype. More preferably, the S. enterica are of the C2 or B serotype. Preferably, the method is a polymerase chain reaction method. More preferably the oligonucleotide molecules for use in the method of the invention are labelled. Even more preferably the hybridised oligonucleotide molecules are detected by electrophoresis.

[0058] In another aspect, the present invention relates to a method for testing a sample derived from a patient for the presence of one or more particular bacterial O antigens comprising contacting the sample with at least one pair of oligonucleotide molecules with at least one oligonucleotide molecule of the pair being capable of specifically hybridising to: (i) a gene encoding an O antigen transferase, or (ii) a gene encoding an enzyme for transport or processing of the oligosaccharide or polysaccharide unit, including a wzx or wzy gene; wherein said gene is involved in the synthesis of the particular O antigen; under conditions suitable to permit the at least one oligonucleotide molecule to specifically hybridise to at least one such gene of any bacteria expressing the particular bacterial O antigen present in the sample and detecting any specifically hybridised oligonucleotide molecules. Preferably the bacteria are E. coli or S. enterica. More preferably, the E. coli are of the O111 or O157 serotype. More preferably, the S. enterica are of the C2 or B serotype. Preferably, the method is a polymerase chain reaction method. More preferably the oligonucleotide molecules for use in the method of the invention are labelled. Even more preferably the hybridised oligonucleotide molecules are detected by electrophoresis.

[0059] In the above described methods it will be understood that where pairs of oligonucleotides are used one of the oligonucleotide sequences may hybridise to a sequence that is not from a transferase, wzx or wzy gene. Further where both hybridise to one of these gene products they may hybridise to the same or a different one of these genes.

[0060] In addition it will be understood that where cross reactivity is an issue a combination of oligonucleotides may be chosen to detect a combination of genes to provide specificity.
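The idea of detecting a combination of genes to obtain specificity can be expressed as a simple decision rule. The sketch below is illustrative only: the antigen labels, gene labels and panel contents are hypothetical placeholders, not panels disclosed in this specification.

```python
# Sketch of a "combination of genes" call (all names and panels hypothetical).

def call_antigens(detected_genes, panels):
    """Return antigens whose full set of required gene targets was detected.

    `panels` maps an antigen label to the set of gene targets that must all
    give a signal before that antigen is reported.
    """
    detected = set(detected_genes)
    return sorted(antigen for antigen, required in panels.items()
                  if required <= detected)


# Hypothetical panels: two clusters share gene_a, so gene_a alone is ambiguous.
example_panels = {
    "antigen_1": {"gene_a", "gene_b"},
    "antigen_2": {"gene_a", "gene_c"},
}

print(call_antigens(["gene_a", "gene_b"], example_panels))
print(call_antigens(["gene_a"], example_panels))
```

Requiring every member of a panel to be detected means that a cross-reacting signal from a single shared gene does not by itself produce a call.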

[0061] The invention further relates to a diagnostic kit which can be used for the detection of bacteria which express bacterial polysaccharide antigens and the identification of the bacterial polysaccharide type of those bacteria.

[0062] Thus in a further aspect, the invention relates to a kit comprising a first vial containing a first nucleic acid molecule capable of specifically hybridising to: (i) a gene encoding a transferase, or (ii) a gene encoding an enzyme for transport or processing oligosaccharide or polysaccharide, including a wzx or wzy gene, wherein the said gene is involved in the synthesis of a bacterial polysaccharide. The kit may also provide in the same or a separate vial a second specific nucleic acid capable of specifically hybridising to: (i) a gene encoding a transferase, or (ii) a gene encoding an enzyme for transport or processing oligosaccharide or polysaccharide, including a wzx or wzy gene, wherein the said gene is involved in the synthesis of a bacterial polysaccharide, wherein the sequence of the second nucleic acid molecule is different from the sequence of the first nucleic acid molecule.

[0063] In a further aspect the invention relates to a kit comprising a first vial containing a first nucleic acid molecule capable of specifically hybridising to: (i) a gene encoding a transferase, or (ii) a gene encoding an enzyme for transport or processing oligosaccharide or polysaccharide including wzx or wzy, wherein the said gene is involved in the synthesis of a bacterial O antigen. The kit may also provide in the same or a separate vial a second specific nucleic acid capable of specifically hybridising to: (i) a gene encoding a transferase, or (ii) a gene encoding an enzyme for transport or processing oligosaccharide or polysaccharide including wzx or wzy, wherein the said gene is involved in the synthesis of O antigen, wherein the sequence of the second nucleic acid molecule is different from the sequence of the first nucleic acid molecule. Preferably the first and second nucleic acid sequences are derived from E. coli or the first and second nucleic acid sequences are derived from S. enterica.

[0064] The present inventors provide the full length sequence of the O157 gene cluster for the first time and recognise that, from the sequence of this previously uncloned full gene cluster, appropriate recombinant molecules can be generated and inserted for expression to provide expressed O157 antigens useful in applications such as vaccines.

DEFINITIONS

[0065] The phrase, “a nucleic acid molecule derived from a gene” means that the nucleic acid molecule has a nucleotide sequence which is either identical or substantially similar to all or part of the identified gene. Thus a nucleic acid molecule derived from a gene can be a molecule which is isolated from the identified gene by physical separation from that gene, or a molecule which is artificially synthesised and has a nucleotide sequence which is either identical to or substantially similar to all or part of the identified gene. While some workers consider only the DNA strand with the same sequence as the mRNA transcribed from the gene, here either strand is intended.
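Because either strand is intended, any comparison of a nucleic acid molecule with a gene has to consider the reverse complement as well as the given strand. A minimal sketch follows; the function names and the toy gene are hypothetical, and exact matching is used as a simplification of "identical or substantially similar".

```python
# Sketch: treat a molecule as matching a gene if it matches either strand.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (upper case)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def matches_either_strand(molecule: str, gene: str) -> bool:
    """True if `molecule` occurs exactly in `gene` on either strand.

    Exact matching is a simplification; the definition above also covers
    sequences that are only substantially similar.
    """
    molecule, gene = molecule.upper(), gene.upper()
    return molecule in gene or reverse_complement(molecule) in gene


toy_gene = "ATGGCCTTAGACGT"  # invented for illustration
print(matches_either_strand("GCCTTA", toy_gene))  # written as the given strand
print(matches_either_strand("TAAGGC", toy_gene))  # written as the opposite strand
```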

[0066] Transferase genes are regions of nucleic acid which have a nucleotide sequence which encodes gene products that transfer monomeric sugar units.

[0067] Flippase or wzx genes are regions of nucleic acid which have a nucleotide sequence which encodes a gene product that flips oligosaccharide repeat units generally composed of three to six monomeric sugar units to the external surface of the membrane.

[0068] Polymerase or wzy genes are regions of nucleic acid which have a nucleotide sequence which encodes gene products that polymerise repeating oligosaccharide units generally composed of 3-6 monomeric sugar units.

[0069] The nucleotide sequences provided in this specification are described in the sequence listing as anti-sense sequences. This term is used in the same manner as it is used in Glossary of Biochemistry and Molecular Biology, Revised Edition, David M. Glick, 1997, Portland Press Ltd., London, on page 11, where the term is described as referring to one of the two strands of double-stranded DNA, usually that which has the same sequence as the mRNA. We use it to describe the strand which has the same sequence as the mRNA.

NOMENCLATURE

Synonyms for E. coli O111 rfb

Current names    Our names    Bastin et al. 1991
wbdH             orf1
gmd              orf2
wbdI             orf3         orf3.4*
manC             orf4         rfbM*
manB             orf5         rfbK*
wbdJ             orf6         orf6.7*
wbdK             orf7         orf7.7*
wzx              orf8         orf8.9 and rfbX*
wzy              orf9
wbdL             orf10
wbdM             orf11

[0070] Other Synonyms

wzy              rfc
wzx              rfbX
rmlA             rfbA
rmlB             rfbB
rmlC             rfbC
rmlD             rfbD
glf              orf6*
wbbI             orf3#, orf8* of E. coli K-12
wbbJ             orf2#, orf9* of E. coli K-12
wbbK             orf1#, orf10* of E. coli K-12
wbbL             orf5#, orf11* of E. coli K-12

BRIEF DESCRIPTION OF DRAWINGS

[0071] FIG. 1 shows EcoRI restriction maps of cosmid clones pPR1054, pPR1055, pPR1056, pPR1058 and pPR1287, which are subclones of the E. coli O111 O antigen gene cluster. The thickened line is the region common to all clones. Broken lines show segments that are non-contiguous on the chromosome. The deduced restriction map for E. coli strain M92 is shown above.

[0072] FIG. 2 shows a restriction mapping analysis of the E. coli O111 O antigen gene cluster within the cosmid clone pPR1058. Restriction enzymes are: B: BamHI; Bg: BglII; E: EcoRI; H: HindIII; K: KpnI; P: PstI; S: SalI; and X: XhoI. Plasmids pPR1230, pPR1231, and pPR1288 are deletion derivatives of pPR1058. Plasmids pPR1237, pPR1238, pPR1239 and pPR1240 are in pUC19. Plasmids pPR1243, pPR1244, pPR1245, pPR1246 and pPR1248 are in pUC18, and pPR1292 is in pUC19. Plasmid pPR1270 is in pT7T319U. Probes 1, 2 and 3 were isolated as internal fragments of pPR1246, pPR1243 and pPR1237 respectively. Dotted lines indicate that subclone DNA extends to the left of the map into attached vector.

[0073] FIG. 3 shows the structure of the E. coli O111 O antigen gene cluster.

[0074] FIG. 4 shows the structure of the E. coli O157 O antigen gene cluster.

[0075] FIG. 5 shows the structure of the S. enterica locus encoding the serogroup C2 O antigen gene cluster.

[0076] FIG. 6 shows the structure of the S. enterica locus encoding the serogroup B O antigen gene cluster.

[0077] FIG. 7 shows the nucleotide sequence of the E. coli O111 O antigen gene cluster. Note: (1) The first and last three bases of a gene are underlined and in italics respectively; (2) The region which was previously sequenced by Bastin and Reeves 1995 “Sequence and analysis of the O antigen gene (rfb) cluster of Escherichia coli O111” Gene 164: 17-23 is marked.

[0078] FIG. 8 shows the nucleotide sequence of the E. coli O157 O antigen gene cluster. Note: (1) The first and last three bases of a gene (region) are underlined and in italics respectively; (2) The region previously sequenced by Bilge et al. 1996 “Role of the Escherichia coli O157:H7 O side chain in adherence and analysis of an rfb locus”. Inf. and Immun. 64: 4795-4801 is marked.

[0079] FIG. 9 shows the nucleotide sequence of the S. enterica serogroup C2 O antigen gene cluster. Note: (1) The numbering is as in Brown et al. 1992 “Molecular analysis of the rfb gene cluster of Salmonella serovar muenchen (strain M67): the genetic basis of the polymorphism between groups C2 and B”. Mol. Microbiol. 6: 1385-1394. (2) The first and last three bases of a gene are underlined and in italics respectively. (3) Only that part of the group C2 gene cluster which differs from that of group B was sequenced and is presented here.

[0080] FIG. 10 shows the nucleotide sequence of the S. enterica serogroup B O antigen gene cluster. Note: (1) The numbering is as in Jiang et al. 1991 “Structure and sequence of the rfb (O antigen) gene cluster of Salmonella serovar typhimurium (strain LT2)”. Mol. Microbiol. 5: 695-713. The first gene in the O antigen gene cluster is rmlB, which starts at base 4099. (2) The first and last three bases of a gene are underlined and in italics respectively.

BEST METHOD FOR CARRYING OUT THE INVENTION

[0081] Materials and Methods-part 1

[0082] The experimental procedures for the isolation and characterisation of the E. coli O111 O antigen gene cluster (position 3,021-9,981) are according to Bastin D. A., et al. 1991 “Molecular cloning and expression in Escherichia coli K-12 of the rfb gene cluster determining the O antigen of an E. coli O111 strain”. Mol. Microbiol. 5(9): 2223-2231 and Bastin D. A. and Reeves, P. R. 1995 “Sequence and analysis of the O antigen gene (rfb) cluster of Escherichia coli O111”. Gene 164: 17-23.

[0083] A. Bacterial strains and growth media

[0084] Bacteria were grown in Luria broth supplemented as required.

[0085] B. Cosmids and phage

[0086] Cosmids in the host strain x2819 were repackaged in vivo. Cells were grown in 250 mL flasks containing 30 mL of culture, with moderate shaking at 30° C. to an optical density of 0.3 at 580 nm. The defective lambda prophage was induced by heating in a water bath at 45° C. for 15 min followed by an incubation at 37° C. with vigorous shaking for 2 hr. Cells were then lysed by the addition of 0.3 mL chloroform and shaking for a further 10 min. Cell debris was removed from 1 mL of lysate by a 5 min spin in a microcentrifuge, and the supernatant removed to a fresh microfuge tube. One drop of chloroform was added and shaken vigorously through the tube contents.

[0087] C. DNA preparation

[0088] Chromosomal DNA was prepared from bacteria grown overnight at 37° C. in a volume of 30 mL of Luria broth. After harvesting by centrifugation, cells were washed and resuspended in 10 mL of 50 mM Tris-HCl pH 8.0. EDTA was added and the mixture incubated for 20 min. Then lysozyme was added and incubation continued for a further 10 min. Proteinase K, SDS, and ribonuclease were then added and the mixture incubated for up to 2 hr for lysis to occur. All incubations were at 37° C. The mixture was then heated to 65° C. and extracted once with 8 mL of phenol at the same temperature. The mixture was extracted once with 5 mL of phenol/chloroform/iso-amyl alcohol at 4° C. Residual phenol was removed by two ether extractions. DNA was precipitated with 2 vols. of ethanol at 4° C., spooled and washed in 70% ethanol, resuspended in 1-2 mL of TE and dialysed. Plasmid and cosmid DNA was prepared by a modification of the Birnboim and Doly method [Birnboim, H. C. and Doly, J. (1979) “A rapid alkaline extraction procedure for screening recombinant plasmid DNA”. Nucl. Acid Res. 7: 1513-1523]. The volume of culture was 10 mL and the lysate was extracted with phenol/chloroform/iso-amyl alcohol before precipitation with isopropanol. Plasmid DNA to be used as vector was isolated on a continuous caesium chloride gradient following alkaline lysis of cells grown in 1 L of culture.

[0089] D. Enzymes and buffers

[0090] Restriction endonucleases and T4 DNA ligase were purchased from Boehringer Mannheim (Castle Hill, NSW, Australia) or Pharmacia LKB (Melbourne, VIC, Australia). Restriction enzymes were used in the recommended commercial buffer.

[0091] E. Construction of a gene bank.

[0092] Individual aliquots of M92 chromosomal DNA (strain Stoke W, from Statens Serum Institut, 5 Artillerivej, 2300 Copenhagen S, Denmark) were partially digested with 0.2 U Sau3AI for 1-15 min. Aliquots giving the greatest proportion of fragments in the size range of approximately 40-50 kb were selected and ligated to vector pPR691 previously digested with BamHI and PvuII. Ligation mixtures were packaged in vitro with packaging extract. The host strain for transduction was x2819 and recombinants were selected with kanamycin.

[0093] F. Serological procedures.

[0094] Colonies were screened for the presence of the O111 antigen by immunoblotting. Colonies were grown overnight, up to 100 per plate, then transferred to nitrocellulose discs and lysed with 0.5 N HCl. Tween 20 was added to TBS at 0.05% final concentration for blocking, incubating and washing steps. Primary antibody was E. coli O group 111 antiserum, diluted 1:800. The secondary antibody was goat anti-rabbit IgG labelled with horseradish peroxidase, diluted 1:5000. The staining substrate was 4-chloro-1-naphthol. Slide agglutination was performed according to the standard procedure.

[0095] G. Recombinant DNA methods.

[0096] Restriction mapping was based on a combination of standard methods including single and double digests and sub-cloning. Deletion derivatives of entire cosmids were produced as follows: aliquots of 1.8 μg of cosmid DNA were digested in a volume of 20 μl with 0.25 U of restriction enzyme for 5-80 min. One half of each aliquot was used to check the degree of digestion on an agarose gel. The sample which appeared to give a representative range of fragments was ligated at 4° C. overnight and transformed by the CaCl₂ method into JM109. Selected plasmids were transformed into Sf174 by the same method. P4657 was transformed with pPR1244 by electroporation.

[0097] H. DNA hybridisation

[0098] Probe DNA was extracted from agarose gels by electroelution and was nick-translated using [α-32P]-dCTP. Chromosomal or plasmid DNA was electrophoresed in 0.8% agarose and transferred to a nitrocellulose membrane. The hybridisation and pre-hybridisation buffers contained either 30% or 50% formamide for low and high stringency probing respectively. Incubation temperatures were 42° C. and 37° C. for pre-hybridisation and hybridisation respectively. Low stringency washing of filters consisted of 3×20 min washes in 2 x SSC and 0.1% SDS. High stringency washing consisted of 3×5 min washes in 2 x SSC and 0.1% SDS at room temperature, a 1 hr wash in 1 x SSC and 0.1% SDS at 58° C. and a 15 min wash in 0.1 x SSC and 0.1% SDS at 58° C.
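The effect of formamide concentration on stringency can be made concrete with a worked estimate. This is a sketch under stated assumptions: the formula is a commonly cited empirical approximation for long DNA duplexes whose coefficients differ slightly between sources, and the GC content, probe length and sodium concentration used below are illustrative values that are not taken from this specification.

```python
import math


def approx_tm_celsius(gc_percent, length_bp, sodium_molar, formamide_percent):
    """Empirical melting-temperature estimate for a long DNA duplex.

    Tm = 81.5 + 16.6*log10[Na+] + 0.41*(%GC) - 0.61*(%formamide) - 500/length
    Coefficients are approximate and vary between published versions.
    """
    return (81.5
            + 16.6 * math.log10(sodium_molar)
            + 0.41 * gc_percent
            - 0.61 * formamide_percent
            - 500.0 / length_bp)


# Illustrative inputs (assumptions): 50% GC, 1 kb probe, 0.4 M Na+.
tm_low = approx_tm_celsius(50, 1000, 0.4, 30)   # 30% formamide buffer
tm_high = approx_tm_celsius(50, 1000, 0.4, 50)  # 50% formamide buffer

# Difference in estimated Tm between the two buffers.
print(round(tm_low - tm_high, 1))
```

At a fixed hybridisation temperature, the buffer giving the lower estimated melting temperature leaves less margin for imperfectly matched duplexes, which is consistent with the higher formamide concentration being used for high stringency probing.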

[0099] I. Nucleotide Sequencing of E. coli O111 O Antigen Gene Cluster (position 3,021-9,981)

[0100] Nucleotide sequencing was performed using an ABI 373 automated sequencer (CA, USA). The region between map positions 3.30 and 7.90 was sequenced using uni-directional exonuclease III digestion of deletion families made in pT7T319U from clones pPR1270 and pPR1272. Gaps were filled largely by cloning of selected fragments into M13mp18 or M13mp19. The region from map positions 7.90-10.2 was sequenced from restriction fragments in M13mp18 or M13mp19. Remaining gaps in both the regions were filled by priming from synthetic oligonucleotides complementary to determined positions along the sequence, using a single stranded DNA template in M13 or phagemid. The oligonucleotides were designed after analysing the adjacent sequence. All sequencing was performed by the chain termination method. Sequences were aligned using SAP [Staden, R., 1982 “Automation of the computer handling of gel reading data produced by the shotgun method of DNA sequencing”. Nuc. Acid Res. 10: 4731-4751; Staden, R., 1986 “The current status and portability of our sequence handling software”. Nuc. Acid Res. 14: 217-231]. The program NIP [Staden, R. 1982 “An interactive graphics program for comparing and aligning nucleic acid and amino acid sequence”. Nuc. Acid Res. 10: 2951-2961] was used to find open reading frames and translate them into proteins.

J. Isolation of clones carrying E. coli O111 O antigen gene cluster.

[0101] The E. coli O antigen gene cluster was isolated according to the method of Bastin D. A., et al. [1991 “Molecular cloning and expression in Escherichia coli K-12 of the rfb gene cluster determining the O antigen of an E. coli O111 strain”. Mol. Microbiol. 5(9): 2223-2231]. Cosmid gene banks of M92 chromosomal DNA were established in the in vivo packaging strain x2819. From the genomic bank, 3.3×10³ colonies were screened with E. coli O111 antiserum using an immunoblotting procedure: 5 colonies (pPR1054, pPR1055, pPR1056, pPR1058 and pPR1287) were positive. The cosmids from these strains were packaged in vivo into lambda particles and transduced into the E. coli deletion mutant Sf174, which lacks all O antigen genes. In this host strain, all plasmids gave positive agglutination with O111 antiserum. An EcoRI restriction map of the 5 independent cosmids showed that they have a region of approximately 11.5 kb in common (FIG. 1). Cosmid pPR1058 included sufficient flanking DNA to identify several chromosomal markers linked to the O antigen gene cluster and was selected for analysis of the O antigen gene cluster region.
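As a rough check on library size, the chance that a bank of a given size contains any particular point on the chromosome can be estimated from the ratio of insert size to genome size. The sketch below assumes random, independent inserts of uniform size; the colony count and the approximate insert size come from the text above, whereas the genome size is an assumed round figure for E. coli and is not stated in this specification.

```python
def prob_locus_in_library(n_clones, insert_kb, genome_kb):
    """Probability that at least one clone contains a given point locus.

    Uses P = 1 - (1 - f)**N with f = insert size / genome size, assuming
    random, independent inserts of uniform size.
    """
    f = insert_kb / genome_kb
    return 1.0 - (1.0 - f) ** n_clones


# 3.3e3 colonies and ~45 kb inserts are taken from the text above;
# the 4,600 kb genome size is an assumed round figure for E. coli.
p = prob_locus_in_library(3300, 45, 4600)
print(p > 0.99)
```

The estimate concerns a single point only; a clone must span the whole gene cluster to give a positive immunoblot, so the number of positive colonies is expected to be lower than this simple calculation suggests.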

[0102] K. Restriction mapping of cosmid pPR1058

[0103] Cosmid pPR1058 was mapped in two stages. A preliminary map was constructed first, and then the region between map positions 0.00 and 23.10 was mapped in detail, since it was shown to be sufficient for O111 antigen expression. Restriction sites for both stages are shown in FIG. 2. The region common to the five cosmid clones was between map positions 1.35 and 12.95 of pPR1058.

[0104] To locate the O antigen gene cluster within pPR1058, cosmid pPR1058 was probed with DNA probes covering O antigen gene cluster flanking regions from S. enterica LT2 and E. coli K-12. Capsular polysaccharide (cps) genes lie upstream of the O antigen gene cluster while the gluconate dehydrogenase (gnd) gene and the histidine (his) operon are downstream, the latter being further from the O antigen gene cluster. The probes used were pPR472 (3.35 kb), carrying the gnd gene of LT2, pPR685 (5.3 kb), carrying two genes of the cps cluster, cpsB and cpsG of LT2, and K350 (16.5 kb), carrying all of the his operon of K-12. Probes hybridised as follows: pPR472 hybridised to 1.55 kb and 3.5 kb (including 2.7 kb of vector) fragments of PstI and HindIII double digests of pPR1246 (a HindIII/EcoRI subclone derived from pPR1058, FIG. 2), which could be located at map positions 12.95-15.1; pPR685 hybridised to a 4.4 kb EcoRI fragment of pPR1058 (including 1.3 kb of vector) located at map position 0.00-3.05; and K350 hybridised with a 32 kb EcoRI fragment of pPR1058 (including 4.0 kb of vector), located at map position 17.30-45.90. Subclones containing the presumed gnd region complemented a gnd edd strain GB23152. On gluconate bromothymol blue plates, pPR1244 and pPR1292 in this host strain gave the green colonies expected of a gnd edd genotype. The his⁺ phenotype was restored by plasmid pPR1058 in the his deletion strain Sf174 on minimal medium plates, showing that the plasmid carries the entire his operon.

[0105] It is likely that the O antigen gene cluster region lies between gnd and cps, as in other E. coli and S. enterica strains, and hence between the approximate map positions 3.05 and 12.95. To confirm this, deletion derivatives of pPR1058 were made as follows: first, pPR1058 was partially digested with HindIII and self-ligated. Transformants were selected for kanamycin resistance and screened for expression of O111 antigen. Two colonies gave a positive reaction. EcoRI digestion showed that the two colonies hosted identical plasmids, one of which was designated pPR1230, with an insert which extended from map positions 0.00 to 23.10. Second, pPR1058 was digested with SalI and partially digested with XhoI and the compatible ends were re-ligated. Transformants were selected with kanamycin and screened for O111 antigen expression. Plasmid DNA of 8 positively reacting clones was checked using EcoRI and XhoI digestion and appeared to be identical. The cosmid of one was designated pPR1231. The insert of pPR1231 contained the DNA region between map positions 0.00 and 15.10. Third, pPR1231 was partially digested with XhoI, self-ligated, and transformants selected on spectinomycin/streptomycin plates. Clones were screened for kanamycin sensitivity and of 10 selected, all had the DNA region from the XhoI site in the vector to the XhoI site at position 4.00 deleted. These clones did not express the O111 antigen, showing that the XhoI site at position 4.00 is within the O antigen gene cluster. One clone was selected and named pPR1288. Plasmids pPR1230, pPR1231, and pPR1288 are shown in FIG. 2.

[0106] L. Analysis of the E. coli O111 O antigen gene cluster (position 3,021-9,981) nucleotide sequence data

Bastin and Reeves [1995 “Sequence and analysis of the O antigen gene (rfb) cluster of Escherichia coli O111”. Gene 164: 17-23] partially characterised the E. coli O111 O antigen gene cluster by sequencing a fragment from map position 3,021-9,981. FIG. 3 shows the gene organisation of position 3,021-9,981 of the E. coli O111 O antigen gene cluster. orf3 and orf6 have high level amino acid identity with wcaH and wcaG (46.3% and 37.2% respectively), and are likely to be similar in function to sugar biosynthetic pathway genes in the E. coli K-12 colanic acid gene cluster. orf4 and orf5 show high levels of amino acid homology to manC and manB genes respectively. orf7 shows high level homology with rfbH, which is an abequose pathway gene. orf8 encodes a protein with 12 transmembrane segments and has similarity in secondary structure to other wzx genes and is likely therefore to be the O antigen flippase gene.

[0107] Materials and Methods-part 2

[0108] A. Nucleotide sequencing of positions 1 to 3,020 and 9,982 to 14,516 of the E. coli O111 O antigen gene cluster

[0109] The sub clones which contained novel nucleotide sequences, pPR1231 (position 12,007 and 1,510), pPR1237 (map position -300 to 2,744), pPR1239 (map position 2,744 to 4,168), pPR1245 (map position 9,736 to 12,007) and pPR1246 (map position 12,007 to 15,300) (FIG. 2), were characterised as follows: the distal ends of the inserts of pPR1237, pPR1239 and pPR1245 were sequenced using the M13 forward and reverse primers located in the vector. PCR walking was carried out to sequence further into each insert using primers based on the sequence data, and the primers were tagged with M13 forward or reverse primer sequences for sequencing. This PCR walking procedure was repeated until the entire insert was sequenced. pPR1246 was characterised from position 12,007 to 14,516. The DNA of these sub clones was sequenced in both directions. The sequencing reactions were performed using the dideoxy termination method and thermocycling, and reaction products were analysed using fluorescent dye and an ABI automated sequencer (CA, USA).
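One step of the PCR walking procedure, choosing the next primer from the end of the sequence already determined and tagging it with a universal primer sequence, can be sketched as follows. The tag sequences shown are commonly used M13 universal primer sequences and are an assumption, since the specification does not state which sequences were used; the function name, primer length and toy read are also hypothetical.

```python
# Sketch of choosing a tagged walking primer (tag sequences are assumptions).

M13_FORWARD_TAG = "GTAAAACGACGGCCAGT"  # commonly used M13 forward (-20) sequence
M13_REVERSE_TAG = "CAGGAAACAGCTATGAC"  # commonly used M13 reverse sequence


def next_walking_primer(known_sequence, primer_length=20, tag=M13_FORWARD_TAG):
    """Return a 5'-tagged primer taken from the 3' end of the known sequence.

    The insert-specific part primes synthesis beyond the known region; the
    tag provides a universal priming site for sequencing the product.
    """
    if len(known_sequence) < primer_length:
        raise ValueError("known sequence is shorter than the primer length")
    specific = known_sequence[-primer_length:].upper()
    return tag + specific


toy_known = "ACGTTGCAAGCTTGGATCCGTACGATCGGATTACA"  # invented read
print(next_walking_primer(toy_known))
```

Repeating this step on each newly determined stretch of sequence corresponds to the repetition of the walking procedure until the entire insert is covered.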

[0110] B. Analysis of the E. coli O111 O antigen gene cluster (positions 1 to 3,020 and 9,982 to 14,516 of SEQ ID NO:1) nucleotide sequence data.

[0111] The gene organisation of regions of the E. coli O111 O antigen gene cluster which were not characterised by Bastin and Reeves [1995 “Sequence and analysis of the O antigen gene (rfb) cluster of Escherichia coli O111”. Gene 164: 17-23] (positions 1 to 3,020 and 9,982 to 14,516) is shown in FIG. 3. There are two open reading frames in region 1. Four open reading frames are predicted in region 2. The position of each gene is listed in Table 5.
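Open reading frames of the kind counted above are found by scanning the sequence in each reading frame for a start codon followed in frame by a stop codon. The following is a minimal sketch of such a scan and is not the program used for the analysis; the function name, the minimum length and the toy sequence are hypothetical, and only the three forward frames with ATG starts are considered.

```python
# Sketch of a forward-strand ORF scan (hypothetical helper; not the NIP program).

STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=3):
    """Return (start, end) pairs (0-based, end exclusive) of ATG...stop ORFs.

    Scans the three forward reading frames only; a fuller analysis would
    also scan the reverse complement and allow alternative start codons.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                # Length counts the start and stop codons.
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return sorted(orfs)


toy = "CCATGAAATTTGGGTAACCATGCCCTGA"  # invented sequence
print(find_orfs(toy))
```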

[0112] The deduced amino acid sequence of orfl (wbdh) shares about 64% similarity with that of the rfp gene of Shiaella dysenteriae. Rfp and WbdH have very similar hydrophobicity plots and both have a very convincing predicted transmembrane segment in a corresponding position. rfp is a galactosyl transferase involved in the synthesis of LPS core, thus wbdh is likely to be a galactosyl transferase gene. orf2 has 85.7% identity at amino acid level to the gmd gene identified in the E. coli K-12 colanic acid gene cluster and is likely to be a gmd gene. orf9 encodes a protein with 10 predicted transmembrane segments and a large cytoplasmic loop. This inner membrane topology is a characteristic feature of all known O antigen polymerases thus it is likely that orf9 encodes an O antigen polymerase gene, wzy. orf10 (wbdL) has a deduced amino acid sequence with low homology with Lsi2 of Neisseria gonorrhoeae. Lsi2 is responsible for adding GlcNAc to galactose in the synthesis of lipooligosaccharide. Thus it is likely that wbdL is either a colitose or glucose transferase gene. orf11 (wbdM) shares high level nucleotide and amino acid similarity with TrsE of Yersinia enterocholitica. TrsE is a putative sugar transferase thus it is likely that wbdM encodes the colitose or glucose transferase.
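Hydrophobicity plots and predicted transmembrane segments of the kind referred to above are typically obtained by averaging a per-residue hydropathy value over a sliding window. The sketch below uses the Kyte-Doolittle scale with a window of 19 residues and a threshold of 1.6; the specification does not state which scale, window or program was used, so these choices, the function names and the toy peptide are all assumptions.

```python
# Sliding-window hydropathy sketch using the Kyte-Doolittle scale.

KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein, window=19):
    """Mean hydropathy for each full window along the protein."""
    scores = [KYTE_DOOLITTLE[aa] for aa in protein.upper()]
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]


def candidate_tm_windows(protein, window=19, threshold=1.6):
    """Start indices of windows whose mean hydropathy exceeds the threshold."""
    return [i for i, score in enumerate(hydropathy_profile(protein, window))
            if score > threshold]


# Invented peptide: a polar stretch, a hydrophobic stretch, a polar stretch.
toy_protein = "DKERDKERDK" + "LLIVALLIVALLIVALLIV" + "DKERDKERDK"
print(candidate_tm_windows(toy_protein))
```

Two proteins with similar profiles and a candidate segment at corresponding positions would be described, as above, as having very similar hydrophobicity plots.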

[0113] In summary, three putative transferase genes and an O antigen polymerase gene were identified at map positions 1 to 3,020 and 9,982 to 14,516 of the E. coli O111 O antigen gene cluster. A search of GenBank has shown that there are no genes with significant similarity at the nucleotide sequence level for two of the three putative transferase genes or the polymerase gene. SEQ ID NO:1 and FIG. 7 provide the nucleotide sequence of the O111 antigen gene cluster.

[0114] Materials and Methods - part 3

[0115] A. PCR amplification of the O157 antigen gene cluster from an E. coli O157:H7 strain (Strain C664-1992, from Statens Serum Institut, 5 Artillerivej, 2300, Copenhagen S, Denmark).

[0116] The E. coli O157 O antigen gene cluster was amplified by long PCR [Cheng et al. 1994, Effective amplification of long targets from cloned inserts and human and genomic DNA P.N.A.S. USA 91: 5695-569] with one primer (primer #412: att ggt agc tgt aag cca agg gcg gta gcg t) based on the JumpStart sequence usually found in the promoter region of O antigen gene clusters [Hobbs, et al. 1994 “The JumpStart sequence: a 39 bp element common to several polysaccharide gene clusters” Mol. Microbiol. 12: 855-856], and another primer, #482 (cac tgc cat acc gac gac gcc gat ctg ttg ctt gg), based on the gnd gene usually found downstream of the O antigen gene cluster. Long PCR was carried out using the Expand Long Template PCR System from Boehringer Mannheim (Castle Hill NSW Australia), and products, 14 kb in length, from several reactions were combined and purified using the Promega Wizard PCR Preps DNA Purification System (Madison Wis. USA). The PCR product was then extracted with phenol and twice with ether, precipitated with 70% ethanol, and resuspended in 40 μL of water.
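The two long PCR primers quoted above can be checked for length and base composition with a few lines of code. The following is a minimal sketch only; the function name `primer_stats` is illustrative and is not part of the described method, and GC content is reported simply as a descriptive figure, not as a criterion used by the inventors.

```python
def primer_stats(spaced_sequence):
    """Return length and GC percentage for a primer written in spaced triplets."""
    seq = "".join(spaced_sequence.split()).upper()
    gc = seq.count("G") + seq.count("C")
    return {"length": len(seq), "gc_percent": round(100 * gc / len(seq), 1)}

# Primer sequences exactly as quoted in paragraph [0116]
PRIMER_412 = "att ggt agc tgt aag cca agg gcg gta gcg t"           # JumpStart-based
PRIMER_482 = "cac tgc cat acc gac gac gcc gat ctg ttg ctt gg"      # gnd-based

for name, seq in (("#412", PRIMER_412), ("#482", PRIMER_482)):
    print(name, primer_stats(seq))
```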

[0117] B. Construction of a random DNase I bank:

[0118] Two aliquots containing about 150 ng of DNA each were subjected to DNase I digestion using the Novagen DNase I Shotgun Cleavage Kit (Madison Wis. USA) with a modified protocol as described. Each aliquot was diluted into 45 μL of 0.05 M Tris-HCl (pH 7.5), 0.05 mg/mL BSA and 10 mM MnCl₂. 5 μL of a 1:3000 or 1:4500 dilution of DNase I (Novagen) (Madison Wis. USA) in the same buffer was added to each tube respectively, and 10 μL of stop buffer (100 mM EDTA, 30% glycerol, 0.5% Orange G, 0.075% xylene cyanol) (Novagen) (Madison Wis. USA) was added after incubation at 15° C. for 5 min. The DNA from the two DNase I reaction tubes was then combined and fractionated on a 0.8% LMT agarose gel, and the gel segment with DNA of about 1 kb in size (about 1.5 mL agarose) was excised. DNA was extracted from agarose using the Promega Wizard PCR Preps DNA Purification System (Madison Wis. USA) and resuspended in 200 μL water, before being extracted with phenol and twice with ether, and precipitated. The DNA was then resuspended in 17.25 μL water and subjected to T4 DNA polymerase repair and single dA tailing using the Novagen Single dA Tailing Kit (Madison Wis. USA). The reaction product (85 μL containing about 8 ng DNA) was then extracted once with chloroform:isoamyl alcohol (24:1) and ligated to 3×10⁻³ pmol pGEM-T (Promega) (Madison Wis. USA) in a total volume of 100 μL. Ligation was carried out overnight at 4° C. and the ligated DNA was precipitated and resuspended in 20 μL water before being electroporated into E. coli strain JM109 and plated out on BCIG-IPTG plates to give a bank.
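As a rough check on the ligation described above, the molar amount of insert can be estimated from its mass and size. The sketch below assumes an average mass of 650 g/mol per base pair of double-stranded DNA, which is a common approximation and is not stated in the specification; the function name is likewise illustrative. Under that assumption, about 8 ng of roughly 1 kb fragments corresponds to about 0.012 pmol, or approximately a four-fold molar excess over 3×10⁻³ pmol of vector.

```python
AVG_BP_MASS = 650.0  # g/mol per base pair of dsDNA (assumed approximation)

def dsdna_pmol(mass_ng, length_bp, bp_mass=AVG_BP_MASS):
    """Convert a mass of double-stranded DNA (ng) to picomoles of molecules."""
    # ng -> g is 1e-9, mol -> pmol is 1e12, so the net factor is 1e3
    return mass_ng * 1000.0 / (length_bp * bp_mass)

insert_pmol = dsdna_pmol(mass_ng=8, length_bp=1000)   # figures from paragraph [0118]
vector_pmol = 3e-3                                     # pGEM-T amount from paragraph [0118]
ratio = insert_pmol / vector_pmol

print(f"insert: {insert_pmol:.4f} pmol, insert:vector = {ratio:.1f}:1")
```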

[0119] C. Sequencing

[0120] DNA templates from clones of the bank were prepared for sequencing using the 96-well format plasmid DNA miniprep kit from Advanced Genetic Technologies Corp (Gaithersburg Md. USA). The inserts of these clones were sequenced from one or both ends using the standard M13 sequencing primer sites located in the pGEM-T vector. Sequencing was carried out on an ABI377 automated sequencer (CA USA) as described above, after carrying out the sequencing reaction on an ABI Catalyst (CA USA). Sequence gaps and areas of inadequate coverage were PCR amplified directly from O157 chromosomal DNA using primers based on the already obtained sequencing data and sequenced using the standard M13 sequencing primer sites attached to the PCR primers.

[0121] D. Analysis of the E. coli O157 O antigen gene cluster nucleotide sequence data

[0122] Sequence data were processed and analysed using the Staden programs [Staden, R., 1982 “Automation of the computer handling of gel reading data produced by the shotgun method of DNA sequencing.” Nuc. Acid Res. 10: 4731-4751; Staden, R., 1986 “The current status and portability of our sequence handling software”. Nuc. Acid Res. 14: 217-231; Staden, R. 1982 “An interactive graphics program for comparing and aligning nucleic acid and amino acid sequence”. Nuc. Acid Res. 10: 2951-2961]. FIG. 4 shows the structure of the E. coli O157 O antigen gene cluster. Twelve open reading frames were predicted from the sequence data, and the nucleotide and amino acid sequences of all these genes were then used to search the GenBank database for indication of possible function and specificity of these genes. The position of each gene is listed in Table 6. The nucleotide sequence is presented in SEQ ID NO:2 and FIG. 8.

[0123] orfs 10 and 11 showed high level identity to manC and manB and were named manC and manB respectively. orf7 showed 89% identity (at amino acid level) to the gmd gene of the E. coli colanic acid capsule gene cluster (Stevenson G., K. et al. 1996 “Organisation of the Escherichia coli K-12 gene cluster responsible for production of the extracellular polysaccharide colanic acid”. J. Bacteriol. 178: 4885-4893) and was named gmd. orf8 showed 79% and 69% identity (at amino acid level) respectively to wcaG of the E. coli colanic acid capsule gene cluster and to the wbcJ (orf14.8) gene of the Yersinia enterocolitica O8 O antigen gene cluster (Zhang, L. et al. 1997 “Molecular and chemical characterization of the lipopolysaccharide O-antigen and its role in the virulence of Y. enterocolitica serotype O8” Mol. Microbiol. 23: 63-76). Colanic acid and the Yersinia O8 O antigen both contain fucose, as does the O157 O antigen. There are two enzymatic steps required for GDP-L-fucose synthesis from GDP-4-keto-6-deoxy-D-mannose, the product of the gmd gene. However, it has been shown recently (Tonetti, M et al. 1996 Synthesis of GDP-L-fucose by the human FX protein J. Biol. Chem. 271: 27274-27279) that the human FX protein has “significant homology” with the wcaG gene (referred to as Yefb in that paper), and that the FX protein carries out both reactions to convert GDP-4-keto-6-deoxy-D-mannose to GDP-L-fucose. We believe that this makes a very strong case for orf8 carrying out these two steps and propose to name the gene fcl. In support of the one enzyme carrying out both functions is the observation that there are no genes other than manB, manC, gmd and fcl with similar levels of similarity between the three bacterial gene clusters for fucose containing structures. orf5 is very similar to wbeE (rfbE) of Vibrio cholerae O1, which is thought to be the perosamine synthetase, which converts GDP-4-keto-6-deoxy-D-mannose to GDP-perosamine (Stroeher, U.H et al. 1995 “A putative pathway for perosamine biosynthesis is the first function encoded within the rfb region of Vibrio cholerae O1”. Gene 166: 33-42). V. cholerae O1 and E. coli O157 O antigens contain perosamine and N-acetyl-perosamine respectively. The V. cholerae O1 manA, manB, gmd and wbeE genes are the only genes of the V. cholerae O1 gene cluster with significant similarity to genes of the E. coli O157 gene cluster, and we believe that our observations both confirm the prediction made for the function of wbeE of V. cholerae, and show that orf5 of the O157 gene cluster encodes GDP-perosamine synthetase. orf5 is therefore named per. orf5 plus about 100 bp of the upstream region (position 4022-5308) was previously sequenced by Bilge, S.S. et al. [1996 “Role of the Escherichia coli O157:H7 O side chain in adherence and analysis of an rfb locus”. Infect. Immun. 64: 4795-4801].

[0124] orf12 shows high level similarity to the conserved region of about 50 amino acids of various members of an acetyltransferase family (Lin, W., et al. 1994 “Sequence analysis and molecular characterisation of genes required for the biosynthesis of type 1 capsular polysaccharide in Staphylococcus aureus”. J. Bacteriol. 176: 7005-7016) and we believe it is the N-acetyltransferase which converts GDP-perosamine to GDP-perNAc. orf12 has been named wbdR.

[0125] The genes manB, manC, gmd, fcl, per and wbdR account for all of the expected biosynthetic pathway genes of the O157 gene cluster.

[0126] The remaining biosynthetic step(s) required are for synthesis of UDP-GalNAc from UDP-Glc. It has been proposed (Zhang, L., et al. 1997 “Molecular and chemical characterisation of the lipopolysaccharide O antigen and its role in the virulence of Yersinia enterocolitica serotype O8”. Mol. Microbiol. 23: 63-76) that in Yersinia enterocolitica UDP-GalNAc is synthesised from UDP-GlcNAc by a homologue of galactose epimerase (GalE), for which there is a galE-like gene in the Yersinia enterocolitica O8 gene cluster. In the case of O157 there is no galE homologue in the gene cluster and it is not clear how UDP-GalNAc is synthesised. It is possible that the galactose epimerase encoded by the galE gene in the gal operon can carry out conversion of UDP-GlcNAc to UDP-GalNAc in addition to conversion of UDP-Glc to UDP-Gal. There do not appear to be any gene(s) responsible for UDP-GalNAc synthesis in the O157 gene cluster.

[0127] orf4 shows similarity to many wzx genes and is named wzx. orf2, whose predicted protein shows similarity of secondary structure to the products of other wzy genes, is for that reason named wzy.

[0128] The orf1, orf3 and orf6 gene products all have characteristics of transferases, and have been named wbdN, wbdO and wbdP respectively. The O157 O antigen has 4 sugars and 4 transferases are expected. The first transferase to act would put a sugar phosphate onto undecaprenol phosphate. The two transferases known to perform this function, WbaP (RfbP) and WecA (Rfe), transfer galactose phosphate and N-acetyl-glucosamine phosphate respectively to undecaprenol phosphate. Neither of these sugars is present in the O157 structure.

[0129] Further, none of the presumptive transferases in the O157 gene cluster has the transmembrane segments which are found in WecA and WbaP, which transfer a sugar phosphate to undecaprenol phosphate, and which would be expected for any protein that transfers a sugar to undecaprenol phosphate, since the latter is embedded within the membrane.

[0130] The wecA gene, whose product transfers GlcNAc-P to undecaprenol phosphate, is located in the Enterobacterial Common Antigen (ECA) gene cluster and it functions in ECA synthesis in most and perhaps all E. coli strains, and also in O antigen synthesis for those strains which have GlcNAc as the first sugar in the O unit.

[0131] It appears that WecA acts as the transferase for addition of GalNAc-1-P to undecaprenol phosphate for the Yersinia enterocolitica O8 O antigen [Zhang et al. 1997 Molecular and chemical characterisation of the lipopolysaccharide O antigen and its role in the virulence of Yersinia enterocolitica serotype O8 Mol. Microbiol. 23: 63-76] and perhaps does so here, as the O157 structure includes GalNAc. WecA has also been reported to add glucose-1-phosphate to undecaprenol phosphate in E. coli O8 and O9 strains, and an alternative possibility for transfer of the first sugar to undecaprenol phosphate is WecA mediated transfer of glucose, as there is a glucose residue in the O157 O antigen. In either case the requisite number of transferase genes is present if GalNAc or Glc is transferred by WecA and the side chain Glc is transferred by a transferase outside of the O antigen gene cluster.

[0132] orf9 shows high level similarity (44% identity at amino acid level, same length) with the wcaH gene of the E. coli colanic acid capsule gene cluster. The function of this gene is unknown, and we give orf9 the name wbdQ.

[0133] The DNA between manB and wbdR has strong sequence similarity to one of the H-repeat units of E. coli K12. Both of the inverted repeat sequences flanking this region are still recognisable, each with two of the 11 bases being changed. The H-repeat associated protein encoding gene located within this region has a 267 base deletion and mutations in various positions. It seems that the H-repeat unit has been associated with this gene cluster for a long period of time since it translocated to the gene cluster, perhaps playing a role in assembly of the gene cluster as has been proposed in other cases.

[0134] Materials and Methods - part 4

[0135] To test our hypothesis that O antigen genes for transferases and the wzx and wzy genes were more specific than pathway genes for diagnostic PCR, we first carried out PCR using primers for all the E. coli O16 O antigen genes (Table 4). PCR was then carried out using primers for the E. coli O111 transferase, wzx and wzy genes (Table 5, 5A). PCR was also carried out using primers for the E. coli O157 transferase, wzx and wzy genes (Table 6, 6A).

[0136] Chromosomal DNA from the 166 serotypes of E. coli available from Statens Serum Institut, 5 Artillerivej, 2300 Copenhagen Denmark was isolated using the Promega Genomic (Madison Wis. USA) isolation kit. Note that 164 of the serogroups are described by Ewing W. H.: Edwards and Ewings “Identification of the Enterobacteriacea” Elsevier, Amsterdam 1986 and that they are numbered 1-171 with numbers 31, 47, 67, 72, 93, 94 and 122 no longer valid. Of the two serogroup 19 strains we used 19ab strain F8188-41. Lior H. 1994 [“Classification of Escherichia coli” In Escherichia coli in domestic animals and humans pp 31-72. Edited by C. L. Gyles CAB International] adds two more, numbered 172 and 173, to give the 166 serogroups used. Pools containing 5 to 8 samples of DNA per pool were made. Pool numbers 1 to 19 (Table 1) were used in the E. coli O111 and O157 assays. Pool numbers 20 to 28 were also used in the O111 assay, and pool numbers 22 to 24 contained E. coli O111 DNA and were used as positive controls (Table 2). Pool numbers 29 to 42 were also used in the O157 assay, and pool numbers 31 to 36 contained E. coli O157 DNA and were used as positive controls (Table 3). Pool numbers 2 to 20, 30, 43 and 44 were used in the E. coli O16 assay (Tables 1 to 3). Pool number 44 contained DNA of E. coli K-12 strains C600 and WG1 and was used as a positive control, as between them they have all of the E. coli K-12 O16 O antigen genes.
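The serogroup count given above follows directly from the numbering scheme, and can be reproduced as a simple check. This is only an illustrative sketch of the arithmetic; the variable names are not taken from the specification.

```python
# Serogroup numbers stated in paragraph [0136] to be no longer valid
INVALID = {31, 47, 67, 72, 93, 94, 122}

# Serogroups numbered 1-171, minus the invalid numbers
ewing_serogroups = [n for n in range(1, 172) if n not in INVALID]

# Two further serogroups, 172 and 173, added by Lior (1994)
all_serogroups = ewing_serogroups + [172, 173]

print(len(ewing_serogroups), len(all_serogroups))
```

The two printed values are the 164 serogroups attributed to Ewing and the 166 serogroups used in the pools.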

[0137] PCR reactions were carried out under the following conditions: denaturing, 94° C./30″; annealing, temperature varies (refer to Tables 4 to 8)/30″; extension, 72° C./1′; 30 cycles. The PCR reaction was carried out in a volume of 25 μL for each pool. After the PCR reaction, 10 μL of PCR product from each pool was run on an agarose gel to check for amplified DNA.

[0138] Each E. coli and S. enterica chromosomal DNA sample was checked by gel electrophoresis for the presence of chromosomal DNA and by PCR amplification of the E. coli or S. enterica mdh gene using oligonucleotides based on E. coli K-12 or Salmonella enterica LT2 [Boyd et al. (1994) “Molecular genetic basis of allelic polymorphism in malate dehydrogenase (mdh) in natural populations of Escherichia coli and Salmonella enterica” Proc. Nat. Acad. Sci. USA. 91: 1280-1284]. Chromosomal DNA samples from other bacteria were only checked by gel electrophoresis of chromosomal DNA.

[0139] A. Primers based on the E. coli O16 O antigen gene cluster sequence.

[0140] The O antigen gene cluster of E. coli O16 was the only typical E. coli O antigen gene cluster that had been fully sequenced prior to that of O111, and we chose it for testing our hypothesis. One pair of primers for each gene was tested against pools 2 to 20, 30 and 43 of E. coli chromosomal DNA. The primers, annealing temperatures and functional information for each gene are listed in Table 4.

[0141] For the five pathway genes, there were 17/21, 13/21, 0/21, 0/21, 0/21 positive pools for rmlB, rmlD, rmlA, rmlC and glf respectively (Table 4). For the wzx, wzy and three transferase genes there were no positives amongst the 21 pools of E. coli chromosomal DNA tested (Table 4). In each case the #44 pool gave a positive result.
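The pool counts reported for the five O16 pathway genes can be summarised as cross-reaction rates. The sketch below uses only the figures stated in the paragraph above; the function name and the convention that zero positive pools means no cross-reaction was observed are illustrative choices, not part of the described assay.

```python
POOLS_TESTED = 21

# Positive pools (out of 21) for the five O16 pathway genes, from paragraph [0141]
pathway_positives = {"rmlB": 17, "rmlD": 13, "rmlA": 0, "rmlC": 0, "glf": 0}

def cross_reaction_rate(positive_pools, pools_tested=POOLS_TESTED):
    """Fraction of non-O16 pools in which a band of the correct size appeared."""
    return positive_pools / pools_tested

for gene, n in pathway_positives.items():
    print(f"{gene}: {n}/{POOLS_TESTED} = {cross_reaction_rate(n):.2f}")

# Genes for which no cross-reacting pool was observed
no_cross_reaction = [g for g, n in pathway_positives.items() if n == 0]
print(no_cross_reaction)
```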

[0142] B. Primers based on the E. coli O111 O antigen gene cluster sequence.

[0143] One to four pairs of primers for each of the transferase, wzx and wzy genes of O111 were tested against pools 1 to 21 of E. coli chromosomal DNA (Table 5). For wbdH, four pairs of primers, which bind to various regions of this gene, were tested and found to be specific for O111, as there was no amplified DNA of the correct size in any of the 21 pools of E. coli chromosomal DNA tested. Three pairs of primers for wbdM were tested, and they are all specific although primers #985/#986 produced a band of the wrong size from one pool. Three pairs of primers for wzx were tested and they were all specific. Two pairs of primers were tested for wzy; both are specific although #980/#983 gave a band of the wrong size in all pools. One pair of primers for wbdL was tested and found to be unspecific, and therefore no further test was carried out. Thus wzx, wzy and two of the three transferase genes are highly specific to O111. Bands of the wrong size found in amplified DNA are assumed to be due to chance hybridisation to genes widely present in E. coli. The primers, annealing temperatures and positions for each gene are listed in Table 5.

[0144] The O111 assay was also performed using pools including DNA from O antigen expressing Yersinia pseudotuberculosis, Shigella boydii and Salmonella enterica strains (Table 5A). None of the oligonucleotides derived from wbdH, wzx, wzy or wbdM gave amplified DNA of the correct size with these pools. Notably, pool number 25 includes S. enterica Adelaide, which has the same O antigen as E. coli O111: this pool did not give a positive PCR result for any primers tested, indicating that these genes are highly specific for E. coli O111.

[0145] Each of the 12 pairs binding to wbdH, wzx, wzy and wbdM produces a band of the predicted size with the pools containing O111 DNA (pool numbers 22 to 24). As pools 22 to 24 included DNA from all strains present in pool 21 plus O111 strain DNA (Table 2), we conclude that the 12 pairs of primers all give a positive PCR test with each of three unrelated O111 strains but not with any other strains tested. Thus these genes are highly specific for E. coli O111.

[0146] C. Primers based on the E. coli O157 O antigen gene cluster sequence.

[0147] Two or three primer pairs for each of the transferase, wzx and wzy genes of O157 were tested against E. coli chromosomal DNA of pools 1 to 19, 29 and 30 (Table 6). For wbdN, three pairs of primers, which bind to various regions of this gene, were tested and found to be specific for O157, as there was no amplified DNA in any of the 21 pools of E. coli chromosomal DNA tested. Three pairs of primers for wbdO were tested, and they are all specific although primers #1211/#1212 produced two or three bands of the wrong size from all pools. Three pairs of primers were tested for wbdP and they were all specific. Two pairs of primers were tested for wbdR and both were specific. For wzy, three pairs of primers were tested and all were specific although primer pair #1203/#1204 produced one or three bands of the wrong size in each pool. For wzx, two pairs of primers were tested and both were specific although primer pair #1217/#1218 produced 2 bands of the wrong size in 2 pools, and 1 band of the wrong size in 7 pools. Bands of the wrong size found in amplified DNA are assumed to be due to chance hybridisation to genes widely present in E. coli. The primers, annealing temperatures and function information for each gene are listed in Table 6.

[0148] The O157 assay was also performed using pools 37 to 42, including DNA from O antigen expressing Yersinia pseudotuberculosis, Shigella boydii, Yersinia enterocolitica O9, Brucella abortus and Salmonella enterica strains (Table 6A). None of the oligonucleotides derived from wbdN, wzy, wbdO, wzx, wbdP or wbdR reacted specifically with these pools, except that primer pair #1203/#1204 produced two bands with Y. enterocolitica O9, one of which is of the same size as that from the positive control. Primer pair #1203/#1204 binds to wzy. The predicted secondary structures of Wzy proteins are generally similar, although there is very low similarity at amino acid or DNA level among the sequenced wzy genes. Thus it is possible that Y. enterocolitica O9 has a wzy gene closely related to that of E. coli O157. It is also possible that this band is due to chance hybridisation to another gene, as the other two wzy primer pairs (#1205/#1206 and #1207/#1208) did not produce any band with Y. enterocolitica O9. Notably, pool number 37 includes S. enterica Landau, which has the same O antigen as E. coli O157, and pools 38 and 39 contain DNA of B. abortus and Y. enterocolitica O9, which cross react serologically with E. coli O157. This result indicates that these genes are highly O157 specific, although one primer pair may have cross reacted with Y. enterocolitica O9.

[0149] Each of the 16 pairs binding to wbdN, wzx, wzy, wbdO, wbdP and wbdR produces a band of the predicted size with the pools containing O157 DNA (pool numbers 31 to 36). As pool 29 included DNA from all strains present in pools 31 to 36 other than O157 strain DNA (Table 3), we conclude that the 16 pairs of primers all give a positive PCR test with each of the six unrelated O157 strains.

[0150] Thus PCR using primers based on genes wbdN, wzy, wbdO, wzx, wbdP and wbdR is highly specific for E. coli O157, giving positive results with each of six unrelated O157 strains, while only one primer pair gave a band of the expected size with one of three strains with O antigens known to cross-react serologically with E. coli O157.

[0151] D. Primers based on the Salmonella enterica serotype C2 and B O antigen gene cluster sequences.

[0152] We also performed PCR using primers for the S. enterica C2 and B serogroup transferase, wzx and wzy genes (Tables 7 to 9). The nucleotide sequences of the C2 and B O antigen gene clusters are listed as SEQ ID NO:3 (FIG. 9) and SEQ ID NO:4 (FIG. 10) respectively. Chromosomal DNA from all the 46 serotypes of Salmonella enterica (Table 9) was isolated using the Promega Genomic isolation kit, and 7 pools of 4 to 8 samples per pool were made. Salmonella enterica serotype B or C2 DNA was omitted from the pools for testing primers of the respective serotype but added to a pool containing 6 other samples to give pool number 8 for use as a positive control.

[0153] PCR reactions were carried out under the following conditions: denaturing, 94° C./30″; annealing, temperature varies (see below)/30″; extension, 72° C./1′; 30 cycles. The PCR reaction was carried out in a volume of 25 μL for each pool. After the PCR reaction, 10 μL of PCR product from each pool was run on an agarose gel to check for amplified DNA. For pools which gave a band of the correct size, PCR was repeated using individual chromosomal samples of that pool, and an agarose gel was run to check for amplified DNA from each sample.
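The two-stage procedure described above, in which pools are screened first and only positive pools are re-tested sample by sample, can be sketched as follows. This is an illustrative outline only: `two_stage_screen`, `mock_pcr`, the pool numbers and the sample names are all hypothetical, and the callable stands in for running a PCR and scoring a band of the correct size on a gel.

```python
def two_stage_screen(pools, amplifies):
    """Screen pools, then re-test individual samples of each positive pool.

    pools     -- dict mapping pool number to a list of sample names
    amplifies -- callable taking a list of sample names and returning True
                 if a band of the correct size is obtained (placeholder for PCR)
    """
    hits = {}
    for pool_no, samples in pools.items():
        if amplifies(samples):                      # stage 1: pooled DNA
            hits[pool_no] = [s for s in samples     # stage 2: individual samples
                             if amplifies([s])]
    return hits

# Hypothetical example: only "sample_E" carries the target sequence
carriers = {"sample_E"}

def mock_pcr(samples):
    return any(s in carriers for s in samples)

example_pools = {
    1: ["sample_A", "sample_B", "sample_C"],
    2: ["sample_D", "sample_E"],
}
result = two_stage_screen(example_pools, mock_pcr)
print(result)
```

Pooling keeps the number of first-round reactions low, and individual reactions are run only for the samples of pools that give a band.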

[0154] The Salmonella enterica serotype B O antigen gene cluster (of strain LT2) was the first O antigen gene cluster to be fully sequenced, and the function of each gene has been identified experimentally [Jiang, X. M., Neal, B., Santiago, F., Lee, S. J., Romana, L. K., and Reeves, P. R. (1991) “Structure and sequence of the rfb (O antigen) gene cluster of Salmonella serovar typhimurium (strain LT2).” Mol. Microbiol. 5(3), 695-713; Liu, D., Cole, R., and Reeves, P. R. (1996). “An O antigen processing function for Wzx (RfbX): a promising candidate for O-unit flippase” J. Bacteriol., 178(7), 2102-2107; Liu, D., Haase, A. M., Lindqvist, L., Lindberg, A. A., and Reeves, P. R. (1993). “Glycosyl transferases of O-antigen biosynthesis in S. enterica: identification and characterisation of transferase genes of groups B, C2 and E1.” J. Bacteriol., 175, 3408-3413; Liu, D., Lindquist, L., and Reeves P. R. (1995). “Transferases of O-antigen biosynthesis in Salmonella enterica: dideoxyhexosyl transferases of groups B and C2 and acetyltransferase of group C2.” J. Bacteriol. 177, 4084-4088; Romana, L. K., Santiago, F. S., and Reeves, P. R. (1991). “High level expression and purification dThymidine-diphospho-D-glucose 4,6 dehydratase (rfbB) from Salmonella serovar typhimurium LT2.” BBRC, 174, 846-852]. One pair of primers for each of the pathway genes and wbaP was tested against the pools of Salmonella enterica DNA; two to three pairs of primers for each of the other transferase genes and the wzx gene were also tested. See Table 8 for a list of primers and functional information for each gene, as well as the annealing temperature of the PCR reaction for each pair of primers.

[0155] For pathway genes of group B strain LT2, there are 19/45, 14/45, 15/45, 12/45, 6/45, 6/45, 6/45, 6/45, 1/45, 9/45, 8/45 positives for rmlB, rmlD, rmlA, rmlC, ddhD, ddhA, ddhB, ddhC, abe, manC, and manB respectively (Table For the LT2 wzx gene we used three primer pairs, each of which gave 1/45 positive. For the 4 transferase genes we used a total of 9 primer pairs. 2 primer pairs for wbaV gave 2/90 positives. For 3 primer pairs of wbaN, 11/135 gave a positive result. For the wbaP primer pair 10/45 gave a positive result (Table 9).

[0156] The experimental data show that oligonucleotides derived from the wzx and wbaV group B O antigen genes are specific for group B O antigen amongst all 45 Salmonella enterica O antigen groups except O group 67. The oligonucleotides derived from Salmonella enterica B group wbaN and wbaU genes detected B group O antigen and also produced positive results with groups A, D1 and D3. wbaU encodes a transferase for a Mannose α(1-4) Mannose linkage and is expressed in groups A, B and D1, while wbaN, which encodes a transferase for a Rhamnose α(1-3) Galactose linkage, is present in groups A, B, D1, D2, D3 and E1. This accounts for the positive results with the group B wbaU and wbaN genes. The wbaN gene of groups E and D2 has considerable sequence differences from that of groups A, B, D1 and D3 and this accounts for the positive results only with groups B, D1 and D3.

[0157] The Salmonella enterica B primers derived from wzx and transferase genes produced a positive result with Salmonella enterica O67. We find that Salmonella enterica O67 has all the genes of the group B O antigen cluster. There are several possible explanations for this finding, including the possibility that the gene cluster is not functional due to mutation and the group O67 antigenicity is due to another antigen, or that the O antigen is modified after synthesis such that its antigenicity is changed. Salmonella enterica O67 would therefore be scored as Salmonella enterica group B in the PCR diagnostic assay. However, this is of little importance because Salmonella enterica O67 is a rare O antigen and only one (serovar Crossness) of the 2324 known serovars has the O67 serotype [Popoff M. Y. et al (1992) “Antigenic formulas of the Salmonella enterica serovars” 6th revision WHO Collaborating Centre for Reference and Research on Salmonella enterica, Institut Pasteur Paris France], and serovar Crossness had only been isolated once [M. Popoff, personal communication].

[0158] The Salmonella enterica B primers derived from wbaP reacted with group A, C2, D1, D2, D3, E1, 54, 55, 67 and E4 O antigen groups. wbaP encodes the galactosyl transferase which initiates O unit synthesis by transfer of galactose phosphate to the lipid carrier undecaprenol phosphate. This reaction is common to the synthesis of several O antigens. As such, wbaP is distinguished from other transferases of the invention as it does not make a linkage within an O antigen.

[0159] We also tested 20 primer pairs for the wzx, wzy and 5 transferase genes of serotype C2 and found no positives in any of the 7 pools (Table 7).

[0160] Groups A, B, D1, D2, D3, C2 and E1 share many genes in common. Some of these genes occur with more than one sequence in which case each specific sequence can be named after one of the serogroups in which it occurs. The distribution of these sequence specificities is shown in Table 10. The inventors have aligned the nucleotide sequences of Salmonella enterica wzy, wzx genes and transferase genes so as to determine specific combinations of nucleic acid molecules which can be employed to specifically detect and identify the Salmonella enterica groups A, B, D1, D2, D3, C2 and E1 (Table 10). The results show that many of the O antigen groups can be detected and identified using a single specific nucleic acid molecule although other groups in particular D2 and E1, and A and D1 require a panel of nucleic acid molecules derived from a combination of genes.

[0161] It will be understood that in carrying out the methods of the invention with respect to the testing of particular sample types, including samples from food, patients and faeces, the samples are prepared by techniques routinely used in the preparation of such samples for DNA based testing.

TABLE 1

Pool No. | Strains of which chromosomal DNA is included in the pool | Source*
1 | E. coli type strains for O serotypes 1, 2, 3, 4, 10, 16, 18 and 39 | IMVS^(a)
2 | E. coli type strains for O serotypes 40, 41, 48, 49, 71, 73, 88 and 100 | IMVS
3 | E. coli type strains for O serotypes 102, 109, 119, 120, 121, 125, 126 and 137 | IMVS
4 | E. coli type strains for O serotypes 138, 139, 149, 7, 5, 6, 11 and 12 | IMVS
5 | E. coli type strains for O serotypes 13, 14, 15, 17, 19ab, 20, 21 and 22 | IMVS
6 | E. coli type strains for O serotypes 23, 24, 25, 26, 27, 28, 29 and 30 | IMVS
7 | E. coli type strains for O serotypes 32, 33, 34, 35, 36, 37, 38 and 42 | IMVS
8 | E. coli type strains for O serotypes 43, 44, 45, 46, 50, 51, 52 and 53 | IMVS
9 | E. coli type strains for O serotypes 54, 55, 56, 57, 58, 59, 60 and 61 | IMVS
10 | E. coli type strains for O serotypes 62, 63, 64, 65, 66, 68, 69 and 70 | IMVS
11 | E. coli type strains for O serotypes 74, 75, 76, 77, 78, 79, 80 and 81 | IMVS
12 | E. coli type strains for O serotypes 82, 83, 84, 85, 86, 87, 89 and 90 | IMVS
13 | E. coli type strains for O serotypes 91, 92, 95, 96, 97, 98, 99 and 101 | IMVS
14 | E. coli type strains for O serotypes 103, 104, 105, 106, 107, 108 and 110 | IMVS
15 | E. coli type strains for O serotypes 112, 162, 113, 114, 115, 116, 117 and 118 | IMVS
16 | E. coli type strains for O serotypes 123, 165, 166, 167, 168, 169, 170 and 171 | See b
17 | E. coli type strains for O serotypes 172, 173, 127, 128, 129, 130, 131 and 132 | See c
18 | E. coli type strains for O serotypes 133, 134, 135, 136, 140, 141, 142 and 143 | IMVS
19 | E. coli type strains for O serotypes 144, 145, 146, 147, 148, 150, 151 and 152 | IMVS

[0162] TABLE 2

Pool No. | Strains of which chromosomal DNA is included in the pool | Source*
20 | E. coli type strains for O serotypes 153, 154, 155, 156, 157, 158, 159 and 160 | IMVS
21 | E. coli type strains for O serotypes 161, 163, 164, 8, 9 and 124 | IMVS
22 | As pool #21, plus E. coli O111 type strain Stoke W. | IMVS
23 | As pool #21, plus E. coli O111:H2 strain C1250-1991 | See d
24 | As pool #21, plus E. coli O111:H12 strain C156-1989 | See e
25 | As pool #21, plus S. enterica serovar Adelaide | See f
26 | Y. pseudotuberculosis strains of O groups IA, IIA, IIB, IIC, III, IVA, IVB, VA, VB, VI and VII | See g
27 | S. boydii strains of serogroups 1, 3, 4, 5, 6, 8, 9, 10, 11, 12, 14 and 15 | See h
28 | S. enterica strains of serovars (each representing a different O group) Typhi, Montevideo, Ferruch, Jangwani, Raus, Hvittingfoss, Waycross, Dan, Dugbe, Basel, 65:i:e,n,z15 and 52:d:e,n,x,z15 | IMVS

[0163] TABLE 3

Pool No. | Strains of which chromosomal DNA is included in the pool | Source*
29 | E. coli type strains for O serotypes 153, 154, 155, 156, 158, 159 and 160 | IMVS
30 | E. coli type strains for O serotypes 161, 163, 164, 8, 9, 111 and 124 | IMVS
31 | As pool #29, plus E. coli O157 type strain A2 (O157:H19) | IMVS
32 | As pool #29, plus E. coli O157:H16 strain C475-89 | See d
33 | As pool #29, plus E. coli O157:H45 strain C727-89 | See d
34 | As pool #29, plus E. coli O157:H2 strain C252-94 | See d
35 | As pool #29, plus E. coli O157:H39 strain C258-94 | See d
36 | As pool #29, plus E. coli O157:H26 | See e
37 | As pool #29, plus S. enterica serovar Landau | See f
38 | As pool #29, plus Brucella abortus | See g
39 | As pool #29, plus Y. enterocolitica O9 | See h
40 | Y. pseudotuberculosis strains of O groups IA, IIA, IIB, IIC, III, IVA, IVB, VA, VB, VI and VII | See i
41 | S. boydii strains of serogroups 1, 3, 4, 5, 6, 8, 9, 10, 11, 12, 14 and 15 | See j
42 | S. enterica strains of serovars (each representing a different O group) Typhi, Montevideo, Ferruch, Jangwani, Raus, Hvittingfoss, Waycross, Dan, Dugbe, Basel, 65:i:e,n,z15 and 52:d:e,n,x,z15 | IMVS
43 | E. coli type strains for O serotypes 1, 2, 3, 4, 10, 18 and 29 | IMVS
44 | As pool #43, plus E. coli K-12 strains C600 and WG1 | IMVS, See k

[0164] TABLE 4 PCR assay result using primers based on the E. coli serotype O16 (strain K-12) O antigen gene cluster sequence
Gene | Function | Base positions of the gene | Forward primer (base positions) | Reverse primer (base positions) | Length of the PCR fragment | Number of pools (out of 21) giving band of correct size | Annealing temperature of the PCR
rmlB* | TDP-rhamnose pathway | 90-1175 | #1064(91-109) | #1065(1175-1157) | 1085 bp | 17 | 60° C.
rmlD* | TDP-rhamnose pathway | 1175-2074 | #1066(1175-1193) | #1067(2075-2058) | 901 bp | 13 | 60° C.
rmlA* | TDP-rhamnose pathway | 2132-3013 | #1068(2131-2148) | #1069(3013-2995) | 883 bp | 0 | 60° C.
rmlC* | TDP-rhamnose pathway | 3013-3570 | #1070(3012-3029) | #1071(3570-3551) | 559 bp | 0 | 60° C.
gtf* | Galactofuranose pathway | 4822-5925 | #1074(4822-4840) | #1075(5925-5908) | 1104 bp | 0 | 55° C.
wzx* | Flippase | 3567-4814 | #1072(3567-3586) | #1073(4814-4797) | 1248 bp | 0 | 55° C.
wzy* | O polymerase | 5925-7091 | #1076(5925-5944) | #1077(7091-7074) | 1167 bp | 0 | 60° C.
wbbI* | Galactofuranosyl transferase | 7094-8086 | #1078(7094-7111) | #1079(8086-8069) | 993 bp | 0 | 50° C.
wbbJ* | Acetyltransferase | 8067-8654 | #1080(8067-8084) | #1081(8654-8632) | 588 bp | 0 | 60° C.
wbbK** | Glucosyl transferase | 5770-6888 | #1082(5770-5787) | #1083(6888-6871) | 1119 bp | 0 | 55° C.
wbbL*** | Rhamnosyltransferase | 679-1437 | #1084(679-697) | #1085(1473-1456) | 795 bp | 0**** | 55° C.
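The fragment lengths listed in Table 4 are consistent with counting from the first base position of the forward primer to the first base position of the reverse primer, both ends inclusive. The following is a minimal Python sketch of that arithmetic, offered only as an illustration: the function name `amplicon_length` and the dictionary `table4_primer_pairs` are hypothetical and do not appear in the specification, and only four of the Table 4 primer pairs are shown.

```python
def amplicon_length(forward_start: int, reverse_start: int) -> int:
    """Return the expected PCR fragment length in base pairs.

    forward_start is the first (5') base position of the forward primer and
    reverse_start is the first (5') base position of the reverse primer,
    both given on the same 1-based coordinate system. Both ends are counted.
    """
    if reverse_start < forward_start:
        raise ValueError("reverse primer must start downstream of the forward primer")
    return reverse_start - forward_start + 1


# (forward primer 5' position, reverse primer 5' position) as listed in Table 4.
# The dictionary itself is illustrative, not part of the specification.
table4_primer_pairs = {
    "rmlB": (91, 1175),    # primers #1064 / #1065
    "rmlD": (1175, 2075),  # primers #1066 / #1067
    "wzx": (3567, 4814),   # primers #1072 / #1073
    "wzy": (5925, 7091),   # primers #1076 / #1077
}

for gene, (fwd, rev) in table4_primer_pairs.items():
    print(f"{gene}: {amplicon_length(fwd, rev)} bp")
```

For the four pairs shown, the computed values agree with the "Length of the PCR fragment" column of Table 4.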

[0165] TABLE 5 PCR assay data using O111 primers
Gene | Base positions of the gene according to SEQ ID NO: 1 | Forward primer (base positions) | Reverse primer (base positions) | Length of the PCR fragment | Number of pools (out of 21) giving band of correct size | Annealing temperature of the PCR
wbdH | 739-1932 | #866(739-757) | #867(1941-1924) | 1203 bp | 0 | 60° C.
wbdH | 739-1932 | #976(925-942) | #978(1731-1714) | 807 bp | 0 | 60° C.
wbdH | 739-1932 | #976(925-942) | #979(1347-1330) | 423 bp | 0 | 60° C.
wbdH | 739-1932 | #977(1165-1182) | #978(1731-1714) | 567 bp | 0 | 60° C.
wzx | 8646-9911 | #969(8646-8663) | #970(9908-9891) | 1263 bp | 0 | 50° C.
wzx | 8646-9911 | #1060(8906-8923) | #1062(9468-9451) | 563 bp | 0 | 60° C.
wzx | 8646-9911 | #1061(9150-9167) | #1063(9754-9737) | 605 bp | 0 | 50° C.
wzy | 9901-10953 | #900(9976-9996) | #901(10827-10807) | 852 bp | 0 | 60° C.
wzy | 9901-10953 | #980(10113-10130) | #983(10484-10467) | 372 bp | 0* | 61° C.
wbdL | 10931-11824 | #870(10931-10949) | #871(11824-11796) | 894 bp | 7 | 60° C.
wbdM | 11821-12945 | #868(11821-11844) | #869(12945-12924) | 1125 bp | 0 | 60° C.
wbdM | 11821-12945 | #984(12042-12059) | #987(12447-12430) | 406 bp | 0 | 60° C.
wbdM | 11821-12945 | #985(12258-12275) | #986(12698-12681) | 441 bp | 0** | 65° C.
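Table 5 identifies each primer by its base positions in SEQ ID NO: 1, with reverse primers written from the higher position to the lower one. Assuming the usual convention that such a reverse primer is the reverse complement of the listed stretch of the given strand (the tables do not state this explicitly), a primer sequence could be recovered from the positions as sketched below in Python. The helper names are hypothetical, the template is only the first 30 bases of SEQ ID NO: 1, and the positions 1-10 and 30-21 are made up for illustration; they are not primers from Table 5.

```python
# Translation table for complementing DNA bases (lower and upper case).
_COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def primer_from_positions(template: str, first: int, last: int) -> str:
    """Return the primer spanning 1-based inclusive positions first..last.

    If first <= last the primer is read directly from the template
    (forward primer). If first > last the stretch last..first is taken
    and reverse-complemented (reverse primer, by the assumed convention).
    """
    if first <= last:
        return template[first - 1:last]
    return reverse_complement(template[last - 1:first])


# First 30 bases of SEQ ID NO: 1, used here only as a short template.
template = "gatctgatggccgtagggcgctacgtgctt"

# Hypothetical positions chosen for illustration only.
forward = primer_from_positions(template, 1, 10)
reverse = primer_from_positions(template, 30, 21)
print(forward)
print(reverse)
```

Applied to the full SEQ ID NO: 1, the same function would take the position pairs exactly as they are written in the primer columns.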

[0166] TABLE 5A PCR specificity test data using O111 primers
Gene | Base positions of the gene according to SEQ ID NO: 1 | Forward primer (base positions) | Reverse primer (base positions) | Length of the PCR fragment | Number of pools (pools no. 25-28) giving band of correct size | Annealing temperature of the PCR
wbdH | 739-1932 | #866(739-757) | #867(1941-1924) | 1203 bp | 0* | 60° C.
wbdH | 739-1932 | #976(925-942) | #978(1731-1714) | 807 bp | 0 | 60° C.
wbdH | 739-1932 | #976(925-942) | #979(1347-1330) | 423 bp | 0 | 60° C.
wbdH | 739-1932 | #977(1165-1182) | #978(1731-1714) | 567 bp | 0 | 60° C.
wzx | 8646-9911 | #969(8646-8663) | #970(9908-9891) | 1263 bp | 0 | 55° C.
wzx | 8646-9911 | #1060(8906-8923) | #1062(9468-9451) | 563 bp | 0 | 60° C.
wzx | 8646-9911 | #1061(9150-9167) | #1063(9754-9737) | 605 bp | 0* | 50° C.
wzy | 9901-10953 | #900(9976-9996) | #901(10827-10807) | 852 bp | 0 | 60° C.
wzy | 9901-10953 | #980(10113-10130) | #983(10484-10467) | 372 bp | 0** | 60° C.
wbdL | 10931-11824 | #870(10931-10949) | #871(11824-11796) | 894 bp | 0 | 60° C.
wbdM | 11821-12945 | #868(11821-11844) | #869(12945-12924) | 1125 bp | 0 | 60° C.
wbdM | 11821-12945 | #984(12042-12059) | #987(12447-12430) | 406 bp | 0 | 60° C.
wbdM | 11821-12945 | #985(12258-12275) | #986(12698-12681) | 441 bp | 0* | 65° C.

[0167] TABLE 6 PCR results using primers based on the E. coli O157 sequence
Gene | Function | Base position of the gene according to SEQ ID NO: 2 | Forward primer (base position) | Reverse primer (base position) | Length of the PCR fragment (bp) | Number of pools (out of 21) giving band of correct size | Annealing temperature of the PCR
wbdN | Sugar transferase | 79-861 | #1197(79-96) | #1198(861-844) | 783 | 0 | 55° C.
wbdN | Sugar transferase | 79-861 | #1199(184-201) | #1200(531-514) | 348 | 0 | 55° C.
wbdN | Sugar transferase | 79-861 | #1201(310-327) | #1202(768-751) | 459 | 0 | 55° C.
wzy | O antigen | 858-2042 | #1203(858-875) | #1204(2042-2025) | 1185 | 0* | 50° C.
wzy | O antigen | 858-2042 | #1205(1053-1070) | #1206(1619-1602) | 567 | 0 | 63° C.
wzy | O antigen | 858-2042 | #1207(1278-1295) | #1208(1913-1896) | 636 | 0 | 60° C.
wbdO | Sugar transferase | 2011-2757 | #1209(2011-2028) | #1210(2757-2740) | 747 | 0 | 50° C.
wbdO | Sugar transferase | 2011-2757 | #1211(2110-2127) | #1212(2493-2476) | 384 | 0** | 62° C.
wbdO | Sugar transferase | 2011-2757 | #1213(2305-2322) | #1214(2682-2665) | 378 | 0 | 60° C.
wzx | O antigen flippase | 2744-4135 | #1215(2744-2761) | #1216(4135-4118) | 1392 | 0 | 50° C.
wzx | O antigen flippase | 2744-4135 | #1217(2942-2959) | #1218(3628-3611) | 687 | 0*** | 63° C.
wbdP | Sugar transferase | 5257-6471 | #1221(5257-5274) | #1222(6471-6454) | 1215 | 0 | 55° C.
wbdP | Sugar transferase | 5257-6471 | #1223(5440-5457) | #1224(5973-5956) | 534 | 0 | 55° C.
wbdP | Sugar transferase | 5257-6471 | #1225(5707-5724) | #1226(6231-6214) | 525 | 0 | 55° C.
wbdR | N-acetyl transferase | 13156-13821 | #1229(13261-13278) | #1230(13629-13612) | 369 | 0 | 55° C.
wbdR | N-acetyl transferase | 13156-13821 | #1231(13384-13401) | #1232(13731-13714) | 348 | 0 | 60° C.

[0168] TABLE 6A PCR results using primers based on the E. coli O157 sequence
Gene | Function | Base position of the gene according to SEQ ID NO: 2 | Forward primer (base positions) | Reverse primer (base positions) | Length of the PCR fragment (bp) | Number of pools (pools no. 37-42) giving band of correct size | Annealing temperature of the PCR
wbdN | Sugar transferase | 79-861 | #1197(79-96) | #1198(861-844) | 783 | 0* | 55° C.
wbdN | Sugar transferase | 79-861 | #1199(184-201) | #1200(531-514) | 348 | 0* | 55° C.
wbdN | Sugar transferase | 79-861 | #1201(310-327) | #1202(768-751) | 459 | 0 | 61° C.
wzy | O antigen polymerase | 858-2042 | #1203(858-875) | #1204(2042-2025) | 1185 | 1** | 50° C.
wzy | O antigen polymerase | 858-2042 | #1205(1053-1070) | #1206(1619-1602) | 567 | 0*** | 60° C.
wzy | O antigen polymerase | 858-2042 | #1207(1278-1295) | #1208(1913-1896) | 636 | 0 | 60° C.
wbdO | Sugar transferase | 2011-2757 | #1209(2011-2028) | #1210(2757-2740) | 747 | 0 | 50° C.
wbdO | Sugar transferase | 2011-2757 | #1211(2110-2127) | #1212(2493-2476) | 384 | 0**** | 61° C.
wbdO | Sugar transferase | 2011-2757 | #1213(2305-2322) | #1214(2682-2665) | 378 | 0 | 60° C.
wzx | O antigen flippase | 2744-4135 | #1215(2744-2761) | #1216(4135-4118) | 1392 | 0 | 50° C.
wzx | O antigen flippase | 2744-4135 | #1217(2942-2959) | #1218(3628-3611) | 687 | 0 | 63° C.
wbdP | Sugar transferase | 5257-6471 | #1221(5257-5274) | #1222(6471-6454) | 1215 | 0 | 55° C.
wbdP | Sugar transferase | 5257-6471 | #1223(5440-5457) | #1224(5973-5956) | 534 | 0* | 60° C.
wbdP | Sugar transferase | 5257-6471 | #1225(5707-5724) | #1226(6231-6214) | 525 | 0 | 55° C.
wbdR | N-acetyl transferase | 13156-13821 | #1229(13261-13278) | #1230(13629- | 369 | 0 | 50° C.
wbdR | N-acetyl transferase | 13156-13821 | #1231(13384-13401) | #1232(13731- | 348 | 0 | 60° C.

[0169] TABLE 7 PCR assay data using primers based on the Salmonella enterica serotype C2 (strain M67) O antigen gene cluster sequence
Gene | Function | Base positions of the gene according to SEQ ID NO: 3 | Forward primer (base position) | Reverse primer (base position) | Length of the PCR fragment | Number of pools (out of 7) giving band of correct size | Annealing temperature of the PCR
wzx | Flippase | 1019-2359 | #1144(1019-1036) | #1145(1414-1397) | 396 bp | 0 | 55° C.
wzx | Flippase | 1019-2359 | #1146(1708-1725) | #1147(2170-2153) | 463 bp | 0 | 55° C.
wzx | Flippase | 1019-2359 | #1148(1938-1955) | #1149(2356-2339) | 419 bp | 0 | 55° C.
wbaR | Abequosyl transferase | 2352-3314 | #1150(2352-2369) | #1151(2759-2742) | 408 bp | 0 | 55° C.
wbaR | Abequosyl transferase | 2352-3314 | #1152(2601-2618) | #1153(3047-3030) | 447 bp | 0 | 55° C.
wbaR | Abequosyl transferase | 2352-3314 | #1154(2910-2927) | #1155(3311-3294) | 402 bp | 0 | 55° C.
wbaL | Acetyl transferase | 3361-3875 | #1156(3361-3378) | #1157(3759-3742) | 399 bp | 0 | 55° C.
wbaL | Acetyl transferase | 3361-3875 | #1158(3578-3595) | #1159(3972-3955) | 395 bp | 0 | 50° C.
wbaQ | Rhamnosyl | 3977-5020 | #1160(3977-3994) | #1161(4378-4361) | 402 bp | 0 | 55° C.
wbaQ | Rhamnosyl | 3977-5020 | #1162(4167-4184) | #1163(4774-4757) | 608 bp | 0 | 55° C.
wbaQ | Rhamnosyl | 3977-5020 | #1164(4603-4620) | #1165(5017-5000) | 415 bp | 0* | 60° C.
wzy | O polymerase | 5114-6313 | #1166(5114-5131) | #1167(5515-5498) | 402 bp | 0** | 55° C.
wzy | O polymerase | 5114-6313 | #1168(5664-5681) | #1169(6112-6095) | 449 bp | 0 | 55° C.
wzy | O polymerase | 5114-6313 | #1170(5907-5924) | #1171(6310-6293) | 404 bp | 0 | 55° C.
wbaW | Mannosyl transferase | 6313-7323 | #1172(6313-6330) | #1173(6805-6788) | 493 bp | 0 | 50° C.
wbaW | Mannosyl transferase | 6313-7323 | #1174(6697-6714) | #1175(7068-7051) | 372 bp | 0 | 55° C.
wbaW | Mannosyl transferase | 6313-7323 | #1176(6905-6922) | #1177(7320-7303) | 416 bp | 0 | 55° C.
wbaZ | Mannosyl transferase | 7310-8467 | #1178(7310-7327) | #1179(7775-7758) | 466 bp | 0 | 50° C.
wbaZ | Mannosyl transferase | 7310-8467 | #1180(7530-7547) | #1181(7907-7890) | 378 bp | 0 | 55° C.
wbaZ | Mannosyl transferase | 7310-8467 | #1182(8007-8024) | #1183(8464-8447) | 458 bp | 0 | 55° C.

[0170] TABLE 8 PCR primers based on the Salmonella enterica serotype B (strain LT2) O antigen gene cluster sequence
Gene | Function | Base position of the gene according to SEQ ID NO: 4 | Forward primer (base position) | Reverse primer (base position) | Length of the PCR fragment | Annealing temperature of the PCR
rmlB | TDP-rhamnose pathway | 4099-5184 | #1094(4100-4117) | #1095(4499-4482) | 400 bp | 55° C.
rmlD | TDP-rhamnose pathway | 5184-6083 | #1092(5186-5203) | #1093(5543-5526) | 358 bp | 50° C.
rmlA | TDP-rhamnose pathway | 6131-7009 | #1090(6531-6548) | #1091(6837-6820) | 308 bp | 55° C.
rmlC | TDP-rhamnose pathway | 7010-7561 | #1088(7013-7030) | #1089(7372-7355) | 360 bp | 55° C.
ddhD | CDP-abequose pathway | 7567-8559 | #1112(7567-7584) | #1113(7970-7953) | 404 bp | 55° C.
ddhA | CDP-abequose pathway | 8556-9329 | #1114(8556-8573) | #1115(8975-8958) | 420 bp | 60° C.
ddhB | CDP-abequose pathway | 9334-10413 | #1116(9334-9351) | #1117(9816-9799) | 483 bp | 45° C.
ddhC | CDP-abequose pathway | 10440-11753 | #1118(10440-10457) | #1119(10871-10854) | 432 bp | 60° C.
abe | CDP-abequose pathway | 11781-12680 | #1100(12008-12025) | #1101(12388-12371) | 381 bp | 55° C.
wzx | Flippase | 12762-14054 | #1120(12762-12779) | #1121(13150-13133) | 389 bp | 55° C.
wzx | Flippase | 12762-14054 | #1122(12993-13010) | #1123(13417-13400) | 425 bp | 55° C.
wzx | Flippase | 12762-14054 | #1124(13635-13652) | #1125(14051-14034) | 417 bp | 55° C.
wbaV | Abequosyl transferase | 14059-15060 | #1126(14059-14076) | #1127(14421-14404) | 363 bp | 45° C.
wbaV | Abequosyl transferase | 14059-15060 | #1128(14688-14705) | #1129(15057-15040) | 370 bp | 45° C.
wbaU | Mannosyl transferase | 15379-16440 | #1130(15379-15396) | #1131(15768-15751) | 390 bp | 60° C.
wbaU | Mannosyl transferase | 15379-16440 | #1132(15850-15867) | #1133(16262-16245) | 413 bp | 50° C.
wbaU | Mannosyl transferase | 15379-16440 | #1134(16027-16044) | #1135(16437-16420) | 411 bp | 60° C.
wbaN | Rhamnosyl transferase | 16441-17385 | #1136(16441-16458) | #1137(16851-16834) | 411 bp | 45° C.
wbaN | Rhamnosyl transferase | 16441-17385 | #1138(16630-16647) | #1139(17087-17070) | 458 bp | 55° C.
wbaN | Rhamnosyl transferase | 16441-17385 | #1140(16978-16995) | #1141(17382-17365) | 405 bp | 50° C.
manC | GDP-mannose pathway | 17386-18825 | #1098(17457-17474) | #1099(18143-18126) | 687 bp | 60° C.
manB | GDP-mannose pathway | 18812-20245 | #1096(18991-19008) | #1097(19345-19328) | 355 bp | 55° C.
wbaP | Galactosyl transferase | 20317-21747 | #1142(20389-20406) | #1143(20709-20692) | 321 bp | 55° C.

[0171] TABLE 9 PCR results using LT2 primers*
Primer pairs and genes, in column order (first part): 1094-1095 rmlB; 1092-1093 rmlD; 1090-1091 rmlA; 1088-1089 rmlC; 1112-1113 ddhD; 1114-1115 ddhA; 1116-1117 ddhB; 1118-1119 ddhC; 1100-1101 abe; 1120-1121 wzx; 1122-1123 wzx; 1124-1125 wzx; 1126-1127 wbaV
Strain name | O group | Results
M8 | A | y y y y y y y y
P9003 | B | y y y y y y y y y y y y y
M40 | C1 |
M67 | C2 | y y y y y y y y
M18 | D1 | y y y y y y y y
M388 | D2 | y y y y y y y
M344 | D3 | y y y y y y y y
M32 | E1 | y y y y
M324 | F | y y y y
M258 | G |
M252 | H |
M264 | I |
M254 | J |
M255 | K |
M7 | L |
M269 | M | y y y
M270 | N |
M95 | O |
M260 | P |
M273 | Q | y
M261 | R |
M282 | S |
M287 | T | y y y y
M295 | U |
M289 | V |
M296 | W |
M278 | X |
M298 | Y |
M93 | Z |
M291 | 51 |
M309 | 52 |
M303 | 53 | y y y
M292 | 54 | y y y y
M304 | 56 | y
M293 | 57 | y y y
M305 | 58 |
M285 | 59 | y y y y
M306 | 60 | y
M328 | 61 |
M330 | 65 |
M322 | 66 |
M1408 | 33 |
M1409 | 62 |
M1410 | 63 | y
M1413 | 67 | y y y y y y y y y y y y y
M74 | E4 | y y y y
Primer pairs and genes, in column order (second part): 1128-1129 wbaV; 1130-1131 wbaU; 1132-1133 wbaU; 1134-1135 wbaU; 1136-1137 wbaN; 1138-1139 wbaN; 1140-1141 wbaN; 1098-1099 manC; 1096-1097 manB; 1142-1143 wbaP
Strain name | O group | Results
M8 | A | y y y y y y y y y
P9003 | B | y y y y y y y y y y
M40 | C1 |
M67 | C2 | y y y
M18 | D1 | y y y y y y y y y
M388 | D2 | y y y
M344 | D3 | y y y y y y
M32 | E1 | y y y
M324 | F |
M258 | G |
M252 | H |
M264 | I |
M254 | J |
M255 | K |
M7 | L |
M269 | M |
M270 | N |
M95 | O |
M260 | P |
M273 | Q |
M261 | R |
M282 | S |
M287 | T |
M295 | U |
M289 | V |
M296 | W |
M278 | X |
M298 | Y |
M93 | Z |
M291 | 51 |
M309 | 52 |
M303 | 53 |
M292 | 54 | y y y
M304 | 56 |
M293 | 57 |
M305 | 58 |
M285 | 59 |
M306 | 60 |
M328 | 61 |
M330 | 65 |
M322 | 66 |
M1408 | 33 | y
M1409 | 62 |
M1410 | 63 |
M1413 | 67 | y y y y y y y y y y
M74 | E4 | y y y

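[0172] TABLE 10 Gene specificities in Salmonella enterica serogroups
Serogroup | wzy | wzx | wbaP | wbaU | wbaN | wbaV | wbaO | wbaW | wbaZ | wbaQ | wbaR
A | B | D | B | B | B | D | — | — | — | — | —
B | B | B | B | B | B | B | — | — | — | — | —
D1 | B | D | B | B | B | D | — | — | — | — | —
D2 | E1 | D | B | — | E1 | D | E1 | — | — | — | —
D3 | D3 | D | B | B | B | D | — | — | — | — | —
C2 | C2 | C2 | B | — | — | — | — | C2 | C2 | C2 | C2
E1 | E1 | E1 | B | — | E1 | — | E1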

[0173]

1 4 1 14516 DNA Escherichia coli 1 gatctgatgg ccgtagggcg ctacgtgctt tctgctgata tctgggctga gttggaaaaa 60 actgctccag gtgcctgggg acgtattcaa ctgactgatg ctattgcaga gttggctaaa 120 aaacagtctg ttgatgccat gctgatgacc ggcgacagct acgactgcgg taagaagatg 180 ggctatatgc aggcattcgt taagtatggg ctgcgcaacc ttaaagaagg ggcgaagttc 240 cgtaagagca tcaagaagct actgagtgag tagagattta cacgtctttg tgacgataag 300 ccagaaaaaa tagcggcagt taacatccag gcttctatgc tttaagcaat ggaatgttac 360 tgccgttttt tatgaaaaat gaccaataat aacaagttaa cctaccaagt ttaatctgct 420 ttttgttgga ttttttcttg tttctggtcg catttggtaa gacaattagc gtgagtttta 480 gagagttttg cgggatctcg cggaactgct cacatctttg gcatttagtt agtgcactgg 540 tagctgttaa gccaggggcg gtagcttgcc taattaattt ttaacgtata catttattct 600 tgccgcttat agcaaataaa gtcaatcgga ttaaacttct tttccattag gtaaaagagt 660 gtttgtagtc gctcagggaa attggttttg gtagtagtac ttttcaaatt atccattttc 720 cgatttagat ggcagttgat gttactatgc tgcatacata tcaatgtata ttatttactt 780 ttagaatgtg atatgaaaaa aatagtgatc ataggcaatg tagcgtcaat gatgttaagg 840 ttcaggaaag aattaatcat gaatttagtg aggcaaggtg ataatgtata ttgtctagca 900 aatgattttt ccactgaaga tcttaaagta ctttcgtcat ggggcgttaa gggggttaaa 960 ttctctctta actcaaaggg tattaatcct tttaaggata taattgctgt ttatgaacta 1020 aaaaaaattc ttaaggatat ttccccagat attgtatttt catattttgt aaagccagta 1080 atatttggaa ctattgcttc aaagttgtca aaagtgccaa ggattgttgg aatgattgaa 1140 ggtctaggta atgccttcac ttattataag ggaaagcaga ccacaaaaac taaaatgata 1200 aagtggatac aaattctttt atataagtta gcattaccga tgcttgatga tttgattcta 1260 ttaaatcatg atgataaaaa agatttaatc gatcagtata atattaaagc taaggtaaca 1320 gtgttaggtg ggattggatt ggatcttaat gagttttcat ataaagagcc accgaaagag 1380 aaaattacct ttatttttat agcaaggtta ttaagagaga aagggatatt tgagtttatt 1440 gaagccgcaa agttcgttaa gacaacttat ccaagttctg aatttgtaat tttaggaggt 1500 tttgagagta ataatccttt ctcattacaa aaaaatgaaa ttgaatcgct aagaaaagaa 1560 catgatctta tttatcctgg tcatgtggaa aatgttcaag attggttaga gaaaagttct 1620 gtttttgttt tacctacatc atatcgagaa ggcgtaccaa gggtgatcca 
agaagctatg 1680 gctattggta gacctgtaat aacaactaat gtacctgggt gtagggatat aataaatgat 1740 ggggtcaatg gctttttgat acctccattt gaaattaatt tactggcaga aaaaatgaaa 1800 tattttattg agaataaaga taaagtactc gaaatggggc ttgctggaag gaagtttgca 1860 gaaaaaaact ttgatgcttt tgaaaaaaat aatagactag catcaataat aaaatcaaat 1920 aatgattttt gacttgagca gaaattattt atatttcaat ctgaaaaata aaggctgtta 1980 ttatgaataa agtggcatta attactggta tcactgggca agatggctcc tatttggcag 2040 aattattgtt agaaaaaggt tatgaagttc atggtattaa acgccgtgca tcttcattta 2100 atactgagcg agtggatcac atctatcagg attcacattt agctaatcct aaactttttc 2160 tacactatgg cgatttgaca gatacttcca atctgacccg tattttaaaa gaagttcaac 2220 cagatgaagt ttacaatttg ggggcgatga gccatgtagc ggtatcattt gagtcaccag 2280 aatacactgc tgatgttgat gcgataggaa cattgcgtct tcttgaagct atcaggatat 2340 tggggctgga aaaaaagaca aaattttatc aggcttcaac ttcagagctt tatggtttgg 2400 ttcaagaaat tccacaaaaa gagactacgc cattttatcc acgttcgcct tatgctgttg 2460 caaaattata tgcctattgg atcactgtta attatcgtga gtcttatggt atgtttgcct 2520 gcaatggtat tctctttaac cacgaatcac ctcgccgtgg cgagaccttt gttactcgta 2580 aaataacacg cgggatagca aatattgctc aaggtcttga taaatgctta tacttgggaa 2640 atatggattc tctgcgtgat tggggacatg ctaaggatta tgtcaaaatg caatggatga 2700 tgctgcagca agaaactcca gaagattttg taattgctac aggaattcaa tattctgtcc 2760 gtgagtttgt cacaatggcg gcagagcaag taggcataga gttagcattt gaaggtgagg 2820 gagtaaatga aaaaggtgtt gttgtttcgg tcaatggcac tgatgctaaa gctgtaaacc 2880 cgggcgatgt aattatatct gtagatccaa ggtattttag gcctgcagaa gttgaaacct 2940 tgcttggcga tcctactaat gcgcataaaa aattaggatg gagccctgaa attacattgc 3000 gtgaaatggt aaaagaaatg gtttccagcg atttagcaat agcgaaaaag aacgtcttgc 3060 tgaaagctaa taacattgcc actaatattc cgcaagaata aaaaagataa tacattaaat 3120 aattaaaaat ggtgctagat ttattagtac cattattttt ttttgggtga ctaatgttta 3180 ttacatcaga taaatttaga gaaattatca agttagttcc attagtatca attgatctgc 3240 taattgaaaa cgagaatggt gaatatttat ttggtcttag gaataatcga ccggccaaaa 3300 attatttttt tgttccaggt ggtaggattc gcaaaaatga atctattaaa aatgctttta 
3360 aaagaatatc atctatggaa ttaggtaaag agtatggtat ttcaggaagt gtttttaatg 3420 gtgtatggga acatttctat gatgatggtt ttttttctga aggcgaggca acacattata 3480 tagtgctttg ttacacactg aaagttctta aaagtgaatt gaatctccca gatgatcaac 3540 atcgtgaata cctttggcta actaaacacc aaataaatgc taaacaagat gttcataact 3600 attcaaaaaa ttattttttg taatttttat taaaaattaa tatgcgagag aattgtatgt 3660 ctcaatgtct ttaccctgta attattgccg gaggaaccgg aagccgtcta tggccgttgt 3720 ctcgagtatt ataccctaaa caatttttaa atttagttgg ggattctaca atgttgcaaa 3780 caacaattac gcgtttggat ggcatcgaat gcgaaaatcc aattgttatc tgcaatgaag 3840 atcaccgatt tattgtagca gagcaattac gacagattgg taagctaacc aagaatatta 3900 tacttgagcc gaaaggccgt aatactgcac ctgccatagc tttagctgct tttatcgctc 3960 agaagaataa tcctaatgac gaccctttat tattagtact tgcggcagac cactctataa 4020 ataatgaaaa agcatttcga gagtcaataa taaaagctat gccgtatgca acttctggga 4080 agttagtaac atttggaatt attccggaca cggcaaatac tggttatgga tatattaaga 4140 gaagttcttc agctgatcct aataaagaat tcccagcata taatgttgcg gagtttgtag 4200 aaaaaccaga tgttaaaaca gcacaggaat atatttcgag tgggaattat tactggaata 4260 gcggaatgtt tttatttcgc gccagtaaat atcttgatga actacggaaa tttagaccag 4320 atatttatca tagctgtgaa tgtgcaaccg ctacagcaaa tatagatatg gactttgtcc 4380 gaattaacga ggctgagttt attaattgtc ctgaagagtc tatcgattat gctgtgatgg 4440 aaaaaacaaa agacgctgta gttcttccga tagatattgg ctggaatgac gtgggttctt 4500 ggtcatcact ttgggatata agccaaaagg attgccatgg taatgtgtgc catggggatg 4560 tgctcaatca tgatggagaa aatagtttta tttactctga gtcaagtctg gttgcgacag 4620 tcggagtaag taatttagta attgtccaaa ccaaggatgc tgtactggtt gcggaccgtg 4680 ataaagtcca aaatgttaaa aacatagttg acgatctaaa aaagagaaaa cgtgctgaat 4740 actacatgca tcgtgcagtt tttcgccctt ggggtaaatt cgatgcaata gaccaaggcg 4800 atagatatag agtaaaaaaa ataatagtta aaccaggaga agggttagat ttaaggatgc 4860 atcatcatag ggcagagcat tggattgttg tatccggtac tgctaaagtt tcactaggta 4920 gtgaagttaa actattagtt tctaatgagt ctatatatat ccctcaggga gcaaaatata 4980 gtcttgagaa tccaggcgta atacctttgc atctaattga agtaagttct ggtgattacc 5040 
ttgaatcaga tgatatagtg cgttttactg acagatataa cagtaaacaa ttcctaaagc 5100 gagattgata aatatgaata aaataacttg cttcaaagca tatgatatac gtgggcgtct 5160 tggtgctgaa ttgaatgatg aaatagcata tagaattggt cgcgcttatg gtgagttttt 5220 taaacctcaa actgtagttg tgggaggaga tgctcgctta acaagtgaga gtttaaagaa 5280 atcactctca aatgggctat gtgatgcagg cgtaaatgtc ttagatcttg gaatgtgtgg 5340 tactgaagag atatattttt ccacttggta tttaggaatt gatggtggaa tcgaggtaac 5400 tgcaagccat aatccaattg attataatgg aatgaaatta gtaaccaaag gtgctcgacc 5460 aatcagcagt gacacaggtc tcaaagatat acaacaatta gtagagagta ataattttga 5520 agagctcaac ctagaaaaaa aagggaatat taccaaatat tccacccgag atgcctacat 5580 aaatcatttg atgggctatg ctaatctgca aaaaataaaa aaaatcaaaa tagttgtgaa 5640 ttctgggaat ggtgcagctg gtcctgttat tgatgctatt gaggaatgct ttttacggaa 5700 caatattccg attcagtttg taaaaataaa taatacaccc gatggtaatt ttccacatgg 5760 tatccctaat ccattactac ctgagtgcag agaagatacc agcagtgcgg ttataagaca 5820 tagtgctgat tttggtattg catttgatgg tgattttgat aggtgttttt tctttgatga 5880 aaatggacaa tttattgaag gatactacat tgttggttta ttagcggaag tttttttagg 5940 gaaatatcca aacgcaaaaa tcattcatga tcctcgcctt atatggaata ctattgatat 6000 cgtagaaagt catggtggta tacctataat gactaaaacc ggtcatgctt acattaagca 6060 aagaatgcgt gaagaggatg ccgtatatgg cggcgaaatg agtgcgcatc attattttaa 6120 agattttgca tactgcgata gtggaatgat tccttggatt ttaatttgtg aacttttgag 6180 tctgacaaat aaaaaattag gtgaactggt ttgtggttgt ataaacgact ggccggcaag 6240 tggagaaata aactgtacac tagacaatcc gcaaaatgaa atagataaat tatttaatcg 6300 ttacaaagat agtgccttag ctgttgatta cactgatgga ttaactatgg agttctctga 6360 ttggcgtttt aatgttagat gctcaaatac agaacctgta gtacgattga atgtagaatc 6420 taggaataat gctattctta tgcaggaaaa aacagaagaa attctgaatt ttatatcaaa 6480 ataaatttgc acctgagttc ataatgggaa caagaaatat atgaaagtac ttctgactgg 6540 ctcaactggc atggttggta agaatatatt agagcatgat agtgcaagta aatataatat 6600 acttactcca accagctctg atttgaattt attagataaa aatgaaatag aaaaattcat 6660 gcttatcaac atgccagact gtattataca tgcagcggga ttagttggag gcattcatgc 6720 aaatataagc 
aggccgtttg attttctgga aaaaaatttg cagatgggtt taaatttagt 6780 ttccgtcgca aaaaaactag gtatcaagaa agtgcttaac ttgggtagtt catgcatgta 6840 ccccaaaaac tttgaagagg ctattcctga gaaagctctg ttaactggtg agctagaaga 6900 aactaatgag ggatatgcta ttgcgaaaat tgctgtagca aaagcatgcg aatatatatc 6960 aagagaaaac tctaattatt tttataaaac aattatccca tgtaatttat atgggaaata 7020 tgataaattt gatgataact cgtcacatat gattccggca gttataaaaa aaatccatca 7080 tgcgaaaatt aataatgtcc cagagatcga aatttggggg gatggtaatt cgcgccgtga 7140 gtttatgtat gcagaagatt tagctgatct tattttttat gttattccta aaatagaatt 7200 catgcctaat atggtaaatg ctggtttagg ttacgattat tcaattaatg actattataa 7260 gataattgca gaagaaattg gttatactgg gagtttttct catgatttaa caaaaccaac 7320 aggaatgaaa cggaagctag tagatatttc attgcttaat aaaattggtt ggtcaagtca 7380 ctttgaactc agagatggca tcagaaagac ctataattat tacttggaga atcaaaataa 7440 atgattacat acccacttgc tagtaatact tgggatgaat atgagtatgc agcaatacag 7500 tcagtaattg actcaaaaat gtttaccatg ggtaaaaagg ttgagttata tgagaaaaat 7560 tttgctgatt tgtttggtag caaatatgcc gtaatggtta gctctggttc tacagctaat 7620 ctgttaatga ttgctgccct tttcttcact aataaaccaa aacttaaaag aggtgatgaa 7680 ataatagtac ctgcagtgtc atggtctacg acatattacc ctctgcaaca gtatggctta 7740 aaggtgaagt ttgtcgatat caataaagaa actttaaata ttgatatcga tagtttgaaa 7800 aatgctattt cagataaaac aaaagcaata ttgacagtaa atttattagg taatcctaat 7860 gattttgcaa aaataaatga gataataaat aatagggata ttatcttact agaagataac 7920 tgtgagtcga tgggcgcggt ctttcaaaat aagcaggcag gcacattcgg agttatgggt 7980 acctttagtt ctttttactc tcatcatata gctacaatgg aagggggctg cgtagttact 8040 gatgatgaag agctgtatca tgtattgttg tgccttcgag ctcatggttg gacaagaaat 8100 ttaccaaaag agaatatggt tacaggcact aagagtgatg atattttcga agagtcgttt 8160 aagtttgttt taccaggata caatgttcgc ccacttgaaa tgagtggtgc tattgggata 8220 gagcaactta aaaagttacc aggttttata tccaccagac gttccaatgc acaatatttt 8280 gtagataaat ttaaagatca tccattcctt gatatacaaa aagaagttgg tgaaagtagc 8340 tggtttggtt tttccttcgt tataaaggag ggagctgcta ttgagaggaa gagtttagta 8400 aataatctga tctcagcagg 
cattgaatgc cgaccaattg ttactgggaa ttttctcaaa 8460 aatgaacgtg ttttgagtta ttttgattac tctgtacatg atacggtagc aaatgccgaa 8520 tatatagata agaatggttt ttttgtcgga aaccaccaga tacctttgtt taatgaaata 8580 gattatctac gaaaagtatt aaaataacta acgaggcact ctatttcgaa tagagtgcct 8640 ttaagatggt attaacagtg aaaaaaattt tagcgtttgg ctattctaaa gtactaccac 8700 cggttattga acagtttgtc aatccaattt gcatcttcat tatcacacca ctaatactca 8760 accacctggg taagcaaagc tatggtaatt ggattttatt aattactatt gtatcttttt 8820 ctcagttaat atgtggagga tgttccgcat ggattgcaaa aatcattgca gaacagagaa 8880 ttcttagtga tttatcaaaa aaaaatgctt tacgtcaaat ttcctataat ttttcaattg 8940 ttattatcgc atttgcggta ttgatttctt ttcttatatt aagtatttgt ttcttcgatg 9000 ttgcgaggaa taattcttca ttcttattcg cgattattat ttgtggtttt tttcaggaag 9060 ttgataattt atttagtggt gcgctaaaag gttttgaaaa atttaatgta tcatgttttt 9120 ttgaagtaat tacaagagtg ctctgggctt ctatagtaat atatggcatt tacggaaatg 9180 cactcttata ttttacatgt ttagccttta ccattaaagg tatgctaaaa tatattcttg 9240 tatgtctgaa tattaccggt tgtttcatca atcctaattt taatagagtt gggattgtta 9300 atttgttaaa tgagtcaaaa tggatgtttc ttcaattaac tggtggcgtc tcacttagtt 9360 tgtttgatag gctcgtaata ccattgattt tatctgtcag taaactggct tcttatgtcc 9420 cttgccttca actagctcaa ttgatgttca ctctttctgc gtctgcaaat caaatattac 9480 taccaatgtt tgctagaatg aaagcatcta acacatttcc ctctaattgt ttttttaaaa 9540 ttctgcttgt atcactaatt tctgttttgc cttgtcttgc gttattcttt tttggtcgtg 9600 atatattatc aatatggata aaccctacat ttgcaactga aaattataaa ttaatgcaaa 9660 ttttagctat aagttacatt ttattgtcaa tgatgacatc ttttcatttc ttgttattag 9720 gaattggtaa atctaagctt gttgcaaatt taaatctggt tgcagggctc gcacttgctg 9780 cttcaacgtt aatcgcagct cattatggcc tttatgcaat atctatggta aaaataatat 9840 atccggcttt tcaattttat tacctttatg tagcttttgt ctattttaat agagcgaaaa 9900 atgtctattg atttactttt ttcaattact gaaatcgcaa ttgttttttc ttgcactatt 9960 tacatattta ctcaatgttt gttaatgcgg aggatctatt tagataaaag tattttaatt 10020 cttttatgct tgctcttttt tttagtaatc attcaacttc ctgagcttaa tgtaaacggt 10080 ttggtcgatt ctttaaagtt 
atcactgcct ttattgatgg tctttatcgc ttttcaaaaa 10140 ccgaaattat gcttgtgggt tattattgca ttgttgtttt tgaactctgc atttaatttt 10200 ttatatttaa agacattcga taagtttagc tcatttcctt ttactttttt tatattgctg 10260 ttttacttgt ttagattggg aattggtaat ttaccggttt ataaaaataa aaaattttac 10320 gcgttgattt ttctctttat attaatagac ataatgcagt cattgttaat aaattatagg 10380 gggcagattt tatattccgt aatttgcatc ctgatacttg tgtttaaagt taatttaaga 10440 aaaaagattc catacttttt tttaatgctg ccagttttat atgtaattat tatggcttat 10500 attggtttta attatttcaa taaaggcgta actttttttg aacctacagc aagtaatatt 10560 gaacgtacgg ggatgatata ttatttggtt tcacagcttg gtgattatat attccatggt 10620 atggggacat taaatttctt aaataacggc ggacaatata agacgttata tggacttcca 10680 tcattaattc ctaatgaccc tcatgatttt ttattacggt tctttataag tattggtgtg 10740 ataggagcat tggtttatca ttctatattt tttgtttttt ttaggagaat atctttctta 10800 ttatatgaga gaaatgctcc tttcattgtt gtaagttgtt tgttactgtt acaagttgtg 10860 ttaatttata cattaaaccc ttttgatgct tttaatcgat tgatttgcgg gcttacagtt 10920 ggagttgttt atggatttgc aaaaattaga taagtatacc tgtaatggaa atttagacgc 10980 tccacttgtt tcaataatca ttgcaactta taattctgaa cttgatatag ctaagtgttt 11040 gcaatcggta actaatcaat cttataagaa tattgaaatc ataataatgg atggaggatc 11100 ttctgataaa acgcttgata ttgcaaaatc gtttaaagac gaccgaataa aaatagtttc 11160 agagaaagat cgtggaattt atgatgcctg gaataaagca gttgatttat ccattggtga 11220 ttgggtagca tttattggtt cagatgatgt ttactatcat acagatgcaa ttgcttcatt 11280 gatgaagggg gttatggtat ctaatggcgc ccctgtggtt tatgggagga cagcgcacga 11340 aggtcccgat aggaacatat ctggattttc aggcagtgaa tggtacaacc taacaggatt 11400 taagtttaat tattacaaat gtaatttacc attgcccatt atgagcgcaa tatattctcg 11460 tgatttcttc agaaacgaac gttttgatat taaattaaaa attgttgctg acgctgattg 11520 gtttctgaga tgtttcatca aatggagtaa agagaagtca ccttatttta ttaatgacac 11580 gacccctatt gttagaatgg gatatggtgg ggtttcgact gatatttctt ctcaagttaa 11640 aactacgcta gaaagtttca ttgtacgcaa aaagaataat atatcctgtt taaacataca 11700 gctgattctt agatatgcta aaattctggt gatggtagcg atcaaaaata tttttggcaa 11760 
taatgtttat aaattaatgc ataacgggta tcattcccta aagaaaatca agaataaaat 11820 atgaagattg tttatataat aaccgggctt acttgtggtg gagccgaaca ccttatgacg 11880 cagttagcag accaaatgtt tatacgcggg catgatgtta atattatttg tctaactggt 11940 atatctgagg taaagccaac acaaaatatt aatattcatt atgttaatat ggataaaaat 12000 tttagaagct tttttagagc tttatttcaa gtaaaaaaaa taattgtcgc cttaaagcca 12060 gatataatac atagtcatat gtttcatgct aatattttta gtcgttttat taggatgctg 12120 attccagcgg tgcccctgat atgtaccgca cacaacaaaa atgaaggtgg caatgcaagg 12180 atgttttgtt atcgactgag tgatttttta gcttctatta ctacaaatgt aagtaaagag 12240 gctgttcaag agtttatagc aagaaaggct acacctaaaa ataaaatagt agagattccg 12300 aattttatta atacaaataa atttgatttt gatattaatg tcagaaagaa aacgcgagat 12360 gcttttaatt tgaaagacag tacagcagta ctgctcgcag taggaagact tgttgaagca 12420 aaagactatc cgaacttatt aaatgcaata aatcatttga ttctttcaaa aacatcaaat 12480 tgtaatgatt ttattttgct tattgctggc gatggcgcat taagaaataa attattggat 12540 ttggtttgtc aattgaatct tgtggataaa gttttcttct tggggcaaag aagtgatatt 12600 aaagaattaa tgtgtgctgc agatcttttt gttttgagtt ctgagtggga aggttttggt 12660 ctcgttgttg cagaagctat ggcgtgtgaa cgtcccgttg ttgctaccga ttctggtgga 12720 gttaaagaag tcgttggacc tcataatgat gttatccctg tcagtaatca tattctgttg 12780 gcagagaaaa tcgctgagac acttaaaata gatgataacg caagaaaaat aataggtatg 12840 aaaaatagag aatatattgt ttccaatttt tcaattaaaa cgatagtgag tgagtgggag 12900 cgcttatatt ttaaatattc caagcgtaat aatataattg attgaaaata taagtttgta 12960 ctctggatgc aatagtttct ctatgctgtt tttttactgg ctccgtattt ttacttatag 13020 ctggattttg ttatatatca gtattaatct gtctcaactt catctagact acattcaagc 13080 cgcgcatgcg tcgcgcggtg actacacctg acaggagtat gtaatgtcca agcaacagat 13140 cggcgtcgtc ggtatggcag tgatggggcg caacctggcg ctcaacatcg aaagccgcgg 13200 ttataccgtc tccatcttca accgctcccg cgagaaaact gaagaagttg ttgccgagaa 13260 cccggataag aaactggttc cttattacac ggtgaaagag ttcgtcgagt ctcttgaaac 13320 cccacgtcgt atcctgttaa tggtaaaagc aggggcggga actgatgctg ctatcgattc 13380 cctgaagccg tatctggata aaggcgacat cattattgat ggtggcaaca 
ccttcttcca 13440 ggacactatc cgtcgtaacc gtgaactgtc cgcggaaggc tttaacttca tcggtaccgg 13500 cgtgtccggc ggtgaagagg gcgccctgaa aggcccatct atcatgccag gtggccagaa 13560 agaagcgtat gagctggttg cgcctatcct gaccaagatt gctgcggttg ctgaagatgg 13620 cgaaccatgt ataacttaca tcggtgctga cggtgcgggt cactacgtga agatggtgca 13680 caacggtatc gaatatggcg atatgcagct gattgctgaa gcctattctc tgcttaaagg 13740 cggccttaat ctgtctaacg aagagctggc aaccactttt accgagtgga atgaaggcga 13800 gctaagtagc tacctgattg acatcaccaa agacatcttc accaaaaaag atgaagaggg 13860 taaatacctg gttgatgtga tcctggacga agctgcgaac aaaggcaccg gtaaatggac 13920 cagccagagc tctctggatc tgggtgaacc gctgtcgctg atcaccgaat ccgtattcgc 13980 tcgctacatc tcttctctga aagaccagcg cattgcggca tctaaagtgc tgtctggtcc 14040 gcaggctaaa ctggctggtg ataaagcaga gttcgttgag aaagtccgtc gcgcgctgta 14100 cctgggtaaa atcgtctctt atgcccaagg cttctctcaa ctgcgtgccg cgtctgacga 14160 atacaactgg gatctgaact acggcgaaat cgcgaagatc ttccgcgcgg gctgcatcat 14220 tcgtgcgcag ttcctgcaga aaattactga cgcgtatgct gaaaacaaag gcattgctaa 14280 cctgttgctg gctccgtact tcaaaaatat cgctgatgaa tatcagcaag cgctgcgtga 14340 tgtagtggct tatgctgtgc agaacggtat tccggtaccg accttctctg cagcggtagc 14400 ctactacgac agctaccgtt ctgcggtact gccggctaat ctgattcagg cacagcgtga 14460 ttacttcggt gcgcacacgt ataaacgcac tgataaagaa ggtgtgttcc acaccg 14516 2 14024 DNA Escherichia coli 2 gtaaccaagg gcggtacgtg cataaatttt aatgcttatc aaaactatta gcattaaaaa 60 tatataagaa attctcaaat gaacaaagaa accgtttcaa taattatgcc cgtttacaat 120 ggggccaaaa ctataatctc atcagtagaa tcaattatac atcaatctta tcaagatttt 180 gttttgtata tcattgacga ttgtagcacc gatgatacat tttcattaat caacagtcga 240 tacaaaaaca atcagaaaat aagaatattg cgtaacaaga caaatttagg tgttgcagaa 300 agtcgaaatt atggaataga aatggccacg gggaaatata tttctttttg tgatgcggat 360 gatttgtggc acgagaaaaa attagagcgt caaatcgaag tgttaaataa tgaatgtgta 420 gatgtggtat gttctaatta ttatgttata gataacaata gaaatattgt tggcgaagtt 480 aatgctcctc atgtgataaa ttatagaaaa atgctcatga aaaactacat agggaatttg 540 acaggaatct ataatgccaa 
caaattgggt aagttttatc aaaaaaagat tggtcacgag 600 gattatttga tgtggctgga aataattaat aaaacaaatg gtgctatttg tattcaagat 660 aatctggcgt attacatgcg ttcaaataat tcactatcgg gtaataaaat taaagctgca 720 aaatggacat ggagtatata tagagaacat ttacatttgt cctttccaaa aacattatat 780 tattttttat tatatgcttc aaatggagtc atgaaaaaaa taacacattc actattaagg 840 agaaaggaga ctaaaaagtg aagtcagcgg ctaagttgat ttttttattc ctatttacac 900 tttatagtct ccagttgtat ggggttatca tagatgatcg tataacaaat tttgatacaa 960 aggtattaac tagtattata attatatttc agattttttt tgttttatta ttttatctaa 1020 cgattataaa tgaaagaaaa cagcagaaaa aatttatcgt gaactgggag ctaaagttaa 1080 tactcgtttt cctttttgtg actatagaaa ttgctgctgt agttttattt cttaaagaag 1140 gtattcctat atttgatgat gatccagggg gggctaaact tagaatagct gaaggtaatg 1200 gactttacat tagatatatt aagtattttg gtaatatagt tgtgtttgca ttaattattc 1260 tttatgatga gcataaattc aaacagagga ccatcatatt tgtatatttt acaacgattg 1320 ctttatttgg ttatcgttct gaattggtgt tgctcattct tcaatatata ttgattacca 1380 atatcctgtc aaaggataac cgtaatccta aaataaaaag aataataggg tattttttat 1440 tggtaggggt tgtatgctcg ttgttttatc taagtttagg acaagacgga gaacaaaatg 1500 actcatataa taatatgtta aggataatta ataggttaac aatagagcaa gttgaaggtg 1560 ttccatatgt tgtttctgaa tctattaaga acgatttctt tccgacacca gagttagaaa 1620 aggaattaaa agcaataata aatagaatac agggaataaa gcatcaagac ttattttatg 1680 gagaacggtt acataaacaa gtatttggag acatgggagc aaatttttta tcagttacta 1740 cgtatggagc agaactgtta gttttttttg gttttctctg tgtattcatt atccctttag 1800 ggatatatat acctttttat cttttaaaga gaatgaaaaa aacccatagc tcgataaatt 1860 gcgcattcta ttcatatatc attatgattt tattgcaata cttagtggct gggaatgcat 1920 cggccttctt ttttggtcct tttctctccg tattgataat gtgtactcct ctgatcttat 1980 tgcatgatac gttaaagaga ttatcacgaa atgaaaatat cagttataac tgtgacttat 2040 aataatgctg aagggttaga aaaaacttta agtagtttat caattttaaa aataaaacct 2100 tttgagatta ttatagttga tggcggctct acagatggaa cgaatcgtgt cattagtaga 2160 tttactagta tgaatattac acatgtttat gaaaaagatg aagggatata tgatgcgatg 2220 aataagggcc gaatgttggc caaaggcgac 
ttaatacatt atttaaacgc cggcgatagc 2280 gtaattggag atatatataa aaatatcaaa gagccatgtt tgattaaagt tggccttttc 2340 gaaaatgata aacttctggg attttcttct ataacccatt caaatacagg gtattgtcat 2400 caaggggtga ttttcccaaa gaatcattca gaatatgatc taaggtataa aatatgtgct 2460 gattataagc ttattcaaga ggtgtttcct gaagggttaa gatctctatc tttgattact 2520 tcgggttatg taaaatatga tatgggggga gtatcttcaa aaaaaagaat tttaagagat 2580 aaagagcttg ccaaaattat gtttgaaaaa aataaaaaaa accttattaa gtttattcca 2640 atttcaataa tcaaaatttt attccctgaa cgtttaagaa gagtattgcg gaaaatgcaa 2700 tatatttgtc taactttatt cttcatgaag aatagttcac catatgataa tgaataaaat 2760 caaaaaaata cttaaatttt gcactttaaa aaaatatgat acatcaagtg ctttaggtag 2820 agaacaggaa aggtacagga ttatatcctt gtctgttatt tcaagtttga ttagtaaaat 2880 actctcacta ctttctctta tattaactgt aagtttaact ttaccttatt taggacaaga 2940 gagatttggt gtatggatga ctattaccag tcttggtgct gctctgacat ttttggactt 3000 aggtatagga aatgcattaa caaacaggat cgcacattca tttgcgtgtg gcaaaaattt 3060 aaagatgagt cggcaaatta gtggtgggct cactttgctg gctggattat cgtttgtcat 3120 aactgcaata tgctatatta cttctggcat gattgattgg caactagtaa taaaaggtat 3180 aaacgagaat gtgtatgcag agttacaaca ctcaattaaa gtctttgtaa tcatatttgg 3240 acttggaatt tattcaaatg gtgtgcaaaa agtttatatg ggaatacaaa aagcctatat 3300 aagtaatatt gttaatgcca tatttatatt gttatctatt attactctag taatatcgtc 3360 gaaactacat gcgggactac cagttttaat tgtcagcact cttggtattc aatacatatc 3420 gggaatctat ttaacaatta atcttattat aaagcgatta ataaagttta caaaagttaa 3480 catacatgct aaaagagaag ctccatattt gatattaaac ggttttttct tttttatttt 3540 acagttaggc actctggcaa catggagtgg tgataacttt ataatatcta taacattggg 3600 tgttacttat gttgctgttt ttagcattac acagagatta tttcaaatat ctacggtccc 3660 tcttacgatt tataacatcc cgttatgggc tgcttatgca gatgctcatg cacgcaatga 3720 tactcaattt ataaaaaaga cgctcagaac atcattgaaa atagtgggta tttcatcatt 3780 cttattggcc ttcatattag tagtgttcgg tagtgaagtc gttaatattt ggacagaagg 3840 aaagattcag gtacctcgaa cattcataat agcttatgct ttatggtctg ttattgatgc 3900 tttttcgaat acatttgcaa gctttttaaa tggtttgaac 
atagttaaac aacaaatgct 3960 tgctgttgta acattgatat tgatcgcaat tccagcaaaa tacatcatag ttagccattt 4020 tgggttaact gttatgttgt actgcttcat ttttatatat attgtaaatt actttatatg 4080 gtataaatgt agttttaaaa aacatatcga tagacagtta aatataagag gatgaaaatg 4140 aaatatatac cagtttacca accgtcattg acaggaaaag aaaaagaata tgtaaatgaa 4200 tgtctggact caacgtggat ttcatcaaaa ggaaactata ttcagaagtt tgaaaataaa 4260 tttgcggaac aaaaccatgt gcaatatgca actactgtaa gtaatggaac ggttgctctt 4320 catttagctt tgttagcgtt aggtatatcg gaaggagatg aagttattgt tccaacactg 4380 acatatatag catcagttaa tgctataaaa tacacaggag ccacccccat tttcgttgat 4440 tcagataatg aaacttggca aatgtctgtt agtgacatag aacaaaaaat cactaataaa 4500 actaaagcta ttatgtgtgt ccatttatac ggacatccat gtgatatgga acaaattgta 4560 gaactggcca aaagtagaaa tttgtttgta attgaagatt gcgctgaagc ctttggttct 4620 aaatataaag gtaaatatgt gggaacattt ggagatattt ctacttttag cttttttgga 4680 aataaaacta ttactacagg tgaaggtgga atggttgtca cgaatgacaa aacactttat 4740 gaccgttgtt tacattttaa aggccaagga ttagctgtac ataggcaata ttggcatgac 4800 gttataggct acaattatag gatgacaaat atctgcgctg ctataggatt agcccagtta 4860 gaacaagctg atgattttat atcacgaaaa cgtgaaattg ctgatattta taaaaaaaat 4920 atcaacagtc ttgtacaagt ccacaaggaa agtaaagatg tttttcacac ttattggatg 4980 gtctcaattc taactaggac cgcagaggaa agagaggaat taaggaatca ccttgcagat 5040 aaactcatcg aaacaaggcc agttttttac cctgtccaca cgatgccaat gtactcggaa 5100 aaatatcaaa agcaccctat agctgaggat cttggttggc gtggaattaa tttacctagt 5160 ttccccagcc tatcgaatga gcaagttatt tatatttgtg aatctattaa cgaattttat 5220 agtgataaat agcctaaaat attgtaaagg tcattcatga aaattgcgtt gaattcagat 5280 ggattttacg agtggggcgg tggaattgat tttattaaat atattctgtc aatattagaa 5340 acgaaaccag aaatatgtat cgatattctt ttaccgagaa atgatataca ttctcttata 5400 agagaaaaag catttccttt taaaagtata ttaaaagcaa ttttaaagag ggaaaggcct 5460 cgatggattt cattaaatag atttaatgag caatactata gagatgcctt tacacaaaat 5520 aatatagaga cgaatcttac ctttattaaa agtaagagct ctgcctttta ttcatatttt 5580 gatagtagcg attgtgatgt tattcttcct tgcatgcgtg ttccttcggg 
aaatttgaat 5640 aaaaaagcat ggattggtta tatttatgac tttcaacact gttactatcc ttcatttttt 5700 agtaagcgag aaatagatca aaggaatgtg ttttttaaat tgatgctcaa ttgcgctaac 5760 aatattattg ttaatgcaca ttcagttatt accgatgcaa ataaatatgt tgggaattat 5820 tctgcaaaac tacattctct tccatttagt ccatgccctc aattaaaatg gttcgctgat 5880 tactctggta atattgccaa atataatatt gacaaggatt attttataat ttgcaatcaa 5940 ttttggaaac ataaagatca tgcaactgct tttagggcat ttaaaattta tactgaatat 6000 aatcctgatg tttatttagt atgcacggga gctactcaag attatcgatt ccctggatat 6060 tttaatgaat tgatggtttt ggcaaaaaag ctcggaattg aatcgaaaat taagatatta 6120 gggcatatac ctaaacttga acaaattgaa ttaatcaaaa attgcattgc tgtaatacaa 6180 ccaaccttat ttgaaggcgg gcctggaggg ggggtaacat ttgacgctat tgcattaggg 6240 aaaaaagtta tactatctga catagatgtc aataaagaag ttaattgcgg tgatgtatat 6300 ttctttcagg caaaaaacca ttattcatta aatgacgcga tggtaaaagc tgatgaatct 6360 aaaatttttt atgaacctac aactctgata gaattgggtc tcaaaagacg caatgcgtgt 6420 gcagattttc ttttagatgt tgtgaaacaa gaaattgaat cccgatctta atatattcaa 6480 gaggtatata atgactaaag tcgctcttat tacaggtgta actggacaag atggatctta 6540 tctagctgag tttttgcttg ataaagggta tgaagttcat ggtatcaaac gccgagcctc 6600 atcttttaat acagaacgca tagaccatat ttatcaagat ccacatggtt ctaacccaaa 6660 ttttcacttg cactatggag atctgactga ttcatctaac ctcactagaa ttctaaagga 6720 ggtacagcca gatgaagtat ataatttagc tgctatgagt cacgtagcag tttcttttga 6780 gtctccagaa tatacagccg atgtcgatgc aattggtaca ttacgtttac tggaagcaat 6840 tcgcttttta ggattggaaa acaaaacgcg tttctatcaa gcttcaacct cagaattata 6900 tggacttgtt caggaaatcc ctcaaaaaga atccacccct ttttatcctc gttcccctta 6960 tgcagttgca aaactttacg catattggat cacggtaaat tatcgagagt catatggtat 7020 ttatgcatgt aatggtatat tgttcaatca tgaatctcca cgccgtggag aaacgtttgt 7080 aacaaggaaa attactcgag gacttgcaaa tattgcacaa ggcttggaat catgtttgta 7140 tttagggaat atggattcgt tacgagattg gggacatgca aaagattatg ttagaatgca 7200 atggttgatg ttacaacagg agcaacccga agattttgtg attgcaacag gagtccaata 7260 ctcagtccgt cagtttgtcg aaatggcagc agcacaactt ggtattaaga tgagctttgt 
7320 tggtaaagga atcgaagaaa aaggcattgt agattcggtt gaaggacagg atgctccagg 7380 tgtgaaacca ggtgatgtca ttgttgctgt tgatcctcgt tatttccgac cagctgaagt 7440 tgatactttg cttggagatc cgagcaaagc taatctcaaa cttggttgga gaccagaaat 7500 tactcttgct gaaatgattt ctgaaatggt tgccaaagat cttgaagccg ctaaaaaaca 7560 ttctctttta aaatcgcatg gtttttctgt aagcttagct ctggaatgat gatgaataag 7620 caacgtattt ttattgctgg tcaccaagga atggttggat cagctattac ccgacgcctc 7680 aaacaacgtg atgatgttga gttggtttta cgtactcggg atgaattgaa cttgttggat 7740 agtagcgctg ttttggattt tttttcttca cagaaaatcg accaggttta tttggcagca 7800 gcaaaagtcg gaggtatttt agctaacagt tcttatcctg ccgattttat atatgagaat 7860 ataatgatag aggcgaatgt cattcatgct gcccacaaaa ataatgtaaa taaactgctt 7920 ttcctcggtt cgtcgtgtat ttatcctaag ttagcacacc aaccgattat ggaagacgaa 7980 ttattacaag ggaaacttga gccaacaaat gaaccttatg ctatcgcaaa aattgcaggt 8040 attaaattat gtgaatctta taaccgtcag tttgggcgtg attaccgttc agtaatgcca 8100 accaatcttt atggtccaaa tgacaatttt catccaagta attctcatgt gattccggcg 8160 cttttgcgcc gctttcatga tgctgtggaa aacaattctc cgaatgttgt tgtttgggga 8220 agtggtactc caaagcgtga attcttacat gtagatgata tggcttctgc aagcatttat 8280 gtcatggaga tgccatacga tatatggcaa aaaaatacta aagtaatgtt gtctcatatc 8340 aatattggaa caggtattga ctgcacgatt tgtgagcttg cggaaacaat agcaaaagtt 8400 gtaggttata aagggcatat tacgttcgat acaacaaagc ccgatggagc ccctcgaaaa 8460 ctacttgatg taacgcttct tcatcaacta ggttggaatc ataaaattac ccttcacaag 8520 ggtcttgaaa atacatacaa ctggtttctt gaaaaccaac ttcaatatcg ggggtaataa 8580 tgtttttaca ttcccaagac tttgccacaa ttgtaaggtc tactcctctt atttctatag 8640 atttgattgt ggaaaacgag tttggcgaaa ttttgctagg aaaacgaatc aaccgcccgg 8700 cacagggcta ttggttcgtt cctggtggta gggtgttgaa agatgaaaaa ttgcagacag 8760 cctttgaacg attgacagaa attgaactag gaattcgttt gcctctctct gtgggtaagt 8820 tttatggtat ctggcagcac ttctacgaag acaatagtat ggggggagac ttttcaacgc 8880 attatatagt tatagcattc cttcttaaat tacaaccaaa cattttgaaa ttaccgaagt 8940 cacaacataa tgcttattgc tggctatcgc gagcaaagct gataaatgat gacgatgtgc 9000 
attataattg tcgcgcatat tttaacaata aaacaaatga tgcgattggc ttagataata 9060 aggatataat atgtctgatg cgccaataat tgctgtagtt atggccggtg gtacaggcag 9120 tcgtctttgg ccactttctc gtgaactata tccaaagcag tttttacaac tctctggtga 9180 taacaccttg ttacaaacga ctttgctacg actttcaggc ctatcatgtc aaaaaccatt 9240 agtgataaca aatgaacagc atcgctttgt tgtggctgaa cagttaaggg aaataaataa 9300 attaaatggt aatattattc tagaaccatg cgggcgaaat actgcaccag caatagcgat 9360 atctgcgttt catgcgttaa aacgtaatcc tcaggaagat ccattgcttc tagttcttgc 9420 ggcagaccac gttatagcta aagaaagtgt tttctgtgat gctattaaaa atgcaactcc 9480 catcgctaat caaggtaaaa ttgtaacgtt tggaattata ccagaatatg ctgaaactgg 9540 ttatgggtat attgagagag gtgaactatc tgtaccgctt caagggcatg aaaatactgg 9600 tttttattat gtaaataagt ttgtcgaaaa gcctaatcgt gaaaccgcag aattgtatat 9660 gacttctggt aatcactatt ggaatagtgg aatattcatg tttaaggcat ctgtttatct 9720 tgaggaattg agaaaattta gacctgacat ttacaatgtt tgtgaacagg ttgcctcatc 9780 ctcatacatt gatctagatt ttattcgatt atcaaaagaa caatttcaag attgtcctgc 9840 tgaatctatt gattttgctg taatggaaaa aacagaaaaa tgtgttgtat gccctgttga 9900 tattggttgg agtgacgttg gatcttggca atcgttatgg gacattagtc taaaatcgaa 9960 aacaggagat gtatgtaaag gtgatatatt aacctatgat actaagaata attatatcta 10020 ctctgagtca gcgttggtag ccgccattgg aattgaagat atggttatcg tgcaaactaa 10080 agatgccgtt cttgtgtcta aaaagagtga tgtacagcat gtaaaaaaaa tagtcgaaat 10140 gcttaaattg cagcaacgta cagagtatat tagtcatcgt gaagttttcc gaccatgggg 10200 aaaatttgat tcgattgacc aaggtgagcg atacaaagtc aagaaaatta ttgtgaaacc 10260 tggtgagggg ctttctttaa ggatgcatca ccatcgttct gaacattgga tcgtgctttc 10320 tggtacagca aaagtaaccc ttggcgataa aactaaacta gtcaccgcaa atgaatcgat 10380 atacattccc cttggcgcag cgtatagtct tgagaatccg ggcataatcc ctcttaatct 10440 tattgaagtc agttcagggg attatttggg agaggatgat attataagac agaaagaacg 10500 ttacaaacat gaagattaac atatgaaatc tttaacctgc tttaaagcct atgatattcg 10560 cgggaaatta ggcgaagaac tgaatgaaga tattgcctgg cgcattgggc gtgcctatgg 10620 cgaatttctc aaaccgaaaa ccattgtttt aggcggtgat gtccgcctca ccagcgaagc 10680 
gttaaaactg gcgcttgcga aaggtttaca ggatgcgggc gtcgatgtgc tggatatcgg 10740 tatgtccggc accgaagaga tctatttcgc cacgttccat ctcggagtgg atggcggcat 10800 cgaagttacc gccagccata acccgatgga ttacaacggc atgaagctgg tgcgcgaagg 10860 ggctcgcccg atcagcggtg ataccggact gcgcgatgtc cagcgtctgg cagaagccaa 10920 tgacttccct cctgtcgatg aaaccaaacg tggtcgctat cagcaaatca atctgcgtga 10980 cgcttacgtt gatcacctgt tcggttatat caacgtcaaa aacctcacgc cgctcaagct 11040 ggtgatcaac tccgggaacg gcgcagcggg tccggtggtg gacgccattg aagcccgatt 11100 taaagccctc ggcgcaccgg tggaattaat caaagtacac aacacgccgg acggcaattt 11160 ccccaacggt attcctaacc cgctgctgcc ggaatgccgc gacgacaccc gtaatgcggt 11220 catcaaacac ggcgcggata tgggcattgc ctttgatggc gattttgacc gctgtttcct 11280 gtttgacgaa aaagggcagt ttatcgaggg ctactacatt gtcggcctgc tggcagaagc 11340 gttcctcgaa aaaaatcccg gcgcgaagat catccacgat ccacgtctct cctggaacac 11400 cgttgatgtg gtgactgccg caggcggcac cccggtaatg tcgaaaaccg gacacgcctt 11460 tattaaagaa cgtatgcgca aggaagacgc catctacggt ggcgaaatga gcgctcacca 11520 ttacttccgt gatttcgctt actgcgacag cggcatgatc ccgtggctgc tggtcgccga 11580 actggtgtgc ctgaaaggaa aaacgctggg cgaaatggtg cgcgaccgga tggcggcgtt 11640 tccggcaagc ggtgagatca acagcaaact ggcgcaaccc gttgaggcaa ttaatcgcgt 11700 ggaacagcat tttagccgcg aggcgctggc ggtggatcgc accgatggca tcagcatgac 11760 ctttgccgac tggcgcttta acctgcgctc ctccaacacc gaaccggtgg tgcggttgaa 11820 tgtggaatca cgcggtgatg taaagctaat ggaaaagaaa actaaagctc ttcttaaatt 11880 gctaagtgag tgattattta cattaatcat taagcgtatt taagattata ttaaagtaat 11940 gttattgcgg tatatgatga atatgtgggc ttttttatgt ataacgacta taccgcaact 12000 ttatctagga aaagattaat agaaataaag ttttgtactg accaatttgc atttcacgtc 12060 acgattgaga cgttcctttg cttaagacat tttttcatcg cttatgtaat aacaaatgtg 12120 ccttatataa aaaggagaac aaaatggaac ttaaaataat tgagacaata gatttttatt 12180 atccctgttt acgatattat agccaaagtt gtatcctgca tcagtcctgc aatatttcac 12240 gagtgctttg ttaactgaat acatgtctgc cattttccag atgataacga cgtcatcgca 12300 attgatggta aaacacttcg gcacacttat gacaagagtc gtcgcagagg 
agtggttcat 12360 gtcattagtg cgtttcagca atgcacagtc tggtcctcgg atagatcaag acggatgaga 12420 aacctaatgc gttcacagtt attcatgaac tttctaaaat gatgggtatt aaaggaaaaa 12480 taatcataac tgatgcgatg gcttgccaga aagatattgc agagaagata taaaaacaga 12540 gatgtgatta tttattcgct gtaaaaggaa ataagagtcg gcttaataga gtctttgagg 12600 agatatttac gctgaaagaa ttaaataatc caaaacatga cagttacgca attagtgaaa 12660 agaggcacgg cagagacgat gtccgtcttc atattgtttg agatgctcct gatgagctta 12720 ttgatttcac gtttgaatgg aaagggctgc agaatttatg aatggcagtc cactttctct 12780 caataatagc agagcaaaag aaagaatccg aaatgacgat caaatattat attagatctg 12840 ctgctttaac cgcagagaag ttcgccacag taaatcgaaa tcactggcgc atggagaata 12900 agttgcacag tagcctgatg tggtaatgaa tgaaatcgac tataatataa gaaggcgagt 12960 tgcattcgaa tgattttcta gaatgcggca catcgctatt aatatctgac aatgataatg 13020 tattcaaggc aggattatca tgtaagatgc gaaaagcagt catggacaga aacttcctag 13080 cgtcaggcat tgcagcgtgc gggctttcat aatcttgcat tggttttgat aagatatttc 13140 tttggagatg ggaaaatgaa tttgtatggt atttttggtg ctggaagtta tggtagagaa 13200 acaataccca ttctaaatca acaaataaag caagaatgtg gttctgacta tgctctggtt 13260 tttgtggatg atgttttggc aggaaagaaa gttaatggtt ttgaagtgct ttcaaccaac 13320 tgctttctaa aagcccctta tttaaaaaag tattttaatg ttgctattgc taatgataag 13380 atacgacaga gagtgtctga gtcaatatta ttacacgggg ttgaaccaat aactataaaa 13440 catccaaata gcgttgttta tgatcatact atgataggta gtggcgctat tatttctccc 13500 tttgttacaa tatctactaa tactcatata gggaggtttt ttcatgcaaa catatactca 13560 tacgttgcac atgattgtca aataggagac tatgttacat ttgctcctgg ggctaaatgt 13620 aatggatatg ttgttattga agacaatgca tatataggct cgggtgcagt aattaagcag 13680 ggtgttccta atcgcccact tattattggc gcgggagcca ttataggtat gggggctgtt 13740 gtcactaaaa gtgttcctgc cggtataact gtgtgcggaa atccagcaag agaaatgaaa 13800 agatcgccaa catctattta atgggaatgc gaaaacacgt tccaaatggg actaatgttt 13860 aaaatatata taatttcgct aatttactaa attatggctt ctttttaagc tatcctttac 13920 ttagttatta ctgatacagc atgaaattta taatactctg atacattttt atacgttatt 13980 caagccgcat atctagcggt aacccctgac 
aggagtaaac aatg 14024
3 12441 DNA Salmonella enterica 3
gttgacaaat accgaccgta taatgaatca aacgttctgg attggtattt atccaggctt 60 gactacagag catttagatt atgtcgtaag taagtttgaa gaattttttg gtttaaattt 120 ctaattttta ggataggatg cttgatgtga ataagaaaat cctaatgact ggcgctacta 180 gctttgtagg tacccatcta ctacatagtc tcataaagga aggttatagt attattgcat 240 taaagcgtcc tataaccgag ccaacgatta tcaatacctt gattgaatgg ttgaatatac 300 aagatataga aaaaatatgt caatcatcta tgaatattca tgcgattgtc catattgcaa 360 cagactatgg tcgaaacaga acccctatat ctgaacaata taaatgtaat gtcctattac 420 caacaagact gcttgagtta atgccagcgc ttaaaacgaa attctttatt tctactgact 480 ctttttttgg gaaatatgag aagcactatg gatatatgcg ttcttacatg gcatctaaaa 540 gacattttgt agaactatca aaaatatacg tagaggaaca tccagacgtt tgttttataa 600 atttacgttt agaacatgtt tacggtgaga gggataaagc aggtaaaata atcccgtatg 660 ttatcaaaaa aatgaaaaac aatgaagata ttgattgtac gatcgccagg cagaaaagag 720 attttattta tatagacgat gttgtttcgg cctatttgaa aattttaaag gagggtttta 780 acgctggaca ctatgatgtc gaggtgggga ctggaaaatc gatagagcta aaagaagtgt 840 ttgagataat aaaaaaagaa acgcatagta gtagtaagat aaattatggt gcagttgcga 900 tgcgtgatga tgagattatg gagtcacatg caaatacctc tttcttgact cgattaggtt 960 ggagtgccga gttttctatt gagaagggtg tgaaaaaaat gttgagtatg aaagagtaat 1020 gaatcgtatt attagaatgt taggtgtaga taaagcaatt cgttatgtta tttttggtaa 1080 gataatatct gtattaacgg gtttactgtt aataatgtta atatcacacc atttatctaa 1140 agacgcacag ggctattatt atacatttaa ttcagtagtg gcactacaga taatatttga 1200 attggggcta tcaacggtaa tcattcaatt cgctagccat gaaatgtcag cgttaaaata 1260 tgattattct gaacgagata ttataggtga aagtaaaaat aagcaacgtt acctatcgtt 1320 atttcggttg gcaataaaat ggtatgcagt aatagctttg ctaataatat taatagtcgg 1380 tcccatcggg tatgtttttt ttacgcaaaa agaaggctta ggtgtacctt ggcaaggggc 1440 atggttatta ttaacaatag ttacagcttt taatattttt cttgtttctg tactttctgt 1500 cgctgaaggg agtgggttaa ttactgatgt gaataaaatg agaatgtatc agtcgctgtt 1560 agctggtata ttggcagtaa gcttacttat tagtggcttt ggactatatg ctacgtctgc 1620 aatagctatt tcagggacta tcatattctc catattttca
tataagtatt ttaaaaaaat 1680 tttcctgcaa tctttaaagc ataaaaataa atatactgaa ggtggtattt catgggttaa 1740 tgaaatattt cctatgcaat ggcgaattgc tctaagttgg atgtcagggt attttattta 1800 ttttgttatg acccccattg cattcaaata tttcggggct atatatgcag ggcagttagg 1860 gatgtcttta acattatgca atatggtaat ggctacgggc ctggcttgga tatccactaa 1920 atatccaaaa tggggagtaa tggtttccaa caaacagctt gcggaactga gtaaatcgtt 1980 caaaagtgca gtaatgcaat catccttttt tgtcttgaca ggattaactg gtgtatacat 2040 ttcattatgg ttattgaaat tatctggttc aaacattggc gagcggtttt tgggattgca 2100 ggattttttc tttttatctt tagcaattat tggtaatcac attgtagctt gctttgcaac 2160 ctatataaga gcgcataaaa ctgaaaaaat gacattggca tcatgtataa tggctctctt 2220 gactataact acaatgttgt ttgttgcata tttagagtac tcgaggttct acatgttaat 2280 gtatgcagca ctaacgtggt tatattttgt tcctcaaact tatataatct ttaaaagatt 2340 caagagttct tatgagtaaa aaacctcttc ttactattgc tattccgaca tataaccgct 2400 cttcatgttt ggctcgttta cttgatagta taattcaaca ggagaactat tgtcatgatg 2460 aactcgaggt tattgtttgt gataatgctt caacagatga aacagcaaga atagccaaga 2520 gtggcttaga taaaataaga aatagtactt atcatctaaa tgaagaaaac ttaggaatgg 2580 atggtaactt ccagaaatgt tttgagttat caaatggaaa atatctttgg atgattggcg 2640 atgatgatct aatagtcaaa aatggtattt cgaaggtttt ttcgatatta aagtcccggc 2700 ctgcattaga tatggtgtat gtaaattcag cagcaaagac tgagttaaac tataatgctg 2760 atgtgaggac gtcattctac acaaatgatg tagattttat ttcagacgtg aaagttatgt 2820 tcacgtttat ttctggaatg atatgtaaga aaactgatgc aattgtcaaa gccgttggta 2880 ttttcagtcc gcaaactact ggaaaatatc ttatgcattt aacatggcaa ttgccattac 2940 ttaaacaggg tggagagttc gcagttatcc ataataatat aattgaggct gagccagata 3000 attcaggtgg atatcattta tataaggttt tttctaataa tcttgcgaca atctttgatg 3060 ttttttatcc cagagagcac cgtgtaagta aaagagttcg cgcatcagca tgtttattct 3120 tacttaactt cataggcgat gaagataaaa ccaaaaattt tgctacaaat aattatttaa 3180 gagattgcga tagtgcattt atagatttaa ttatatataa atatgggctt aggtttttct 3240 atctatatcc taaaactgtg cctttattta gaaaaataaa atatattata aagacggttt 3300 taatgcggaa ataaaaatta ttcaagatgg tttgctgaaa acgacttata 
ggactatcta 3360 atgtttgtct atagtttaag attaaaatta aatcttatca tatcattatt gagtaaagtt 3420 aggcggaaat caaaagcaaa gtttcttgtt ctgcttagcg gatatgattt taaaatggtt 3480 gggaagaatt ttaaattgaa tgtcaaacct tactctgcaa aaaataacac ctcttccaaa 3540 tggggtagta tgcgggttgg tgataactgc tggattgaag ctgtatataa ttatggtgat 3600 gaaaaatttg aaccttattt gtacataggt gatcgtatat gtttaagtga taatgttcat 3660 atttcttgcg tatcatgttt aattttagaa aacgatatat taattggtag caaagtttat 3720 ataggcgatc atagccatgg cagttataaa gtatgcagtc cgaaaataga accgccagca 3780 aataagccat taggtgatat tgctcctatt aaaataggta attgctgctg gattggagat 3840 aatgcagtaa ttctggctgg tagtgaaatt tgtgatggct gtgtaatcgc agctaattca 3900 gtcgtcaagg atttaaaagt cgataagcca tgtttaattg gtggggttcc tgctaaagta 3960 ataaaggtat tttaaaatga atgtttttat cagtatttgt ataccgtctt ataatagagc 4020 tgagttttta gagccactac tggatagcat atataatcaa gattattgtt taaagaataa 4080 tgattttgag gtcattgttt gtgaagataa atctccacag agagatgaga taaactctat 4140 tatcgaaaac tataaagcaa aaaataataa acaaaatctt tatgttaatt tcaatgaaga 4200 taatttaggc tatgataaga atttaaaaaa atgcattagt ttgacgacag gtaaatattg 4260 catgatcatg ggcaacgatg atctattagc agatggagcg ttatcaaaaa tagtgaaagt 4320 tttgaaggct aatcctgaaa ttgtattggc tacgcgagcg tatggttggt ttaaggaaaa 4380 tccgaatgag ttatgtgata ctgttcgtca tttaacagac gatactttat ttcagccggg 4440 ggctgatgcc attaaatttt tcttccgtag agttggagtt atttcaggct ttattgtcaa 4500 tgctgaaaaa gcaaaaaaac tatcgagtga tttatttgat gggcgtttat attatcaaat 4560 gtaccttgct ggtatgctaa tggctgaagg tcagggatac tattttagcg acgtgatgac 4620 attgtcgagg gatacagagg ctcctgactt tggtaacgct ggaactgaaa aaggagtttt 4680 caccccgggg gggtataaac cagagggccg tatacatatg gttgaaggct tgttgctaat 4740 tgcaaaatat atagaagata caacaaaaat tgatggcgtt tatgctggaa ttagaaaaga 4800 cttagcgaac tatttttatc cttatattcg agatcaactc gacttgcctc tttatactta 4860 tattaaaatg ataaataaat ttcggaaaat gggattttca aatgaaaagc ttttctatgt 4920 gcatgccttt ttagggtatg tactaaaacg gaggggctat gatgctttaa ttaaatacat 4980 tcgtagcaaa aaaggcggta ctccgcgtct tggtatttaa cctccacttt caaaaaatgt 
5040 tatgaatata cttcttgctg cgatattagg cgttaactta ttttctccat atattagttc 5100 gtggatggtg ggtatgctgc catttccacc aggagcaatc ctaagggatg tactcaatgt 5160 attttttgtg gcgttagtgc tagttcgatt tgtcattgat aggaaaaaaa cttatttccc 5220 gttggttttt actatttttt catggtcggc ggtaatacta tgggtaatag cgttaactat 5280 attctcaccg gataaaattc aagcaattat gggggggcgg agttatattt tattcccggc 5340 agttttcata gcattagtga ttttaaaagt atcatacccg caatccttaa atattgaaaa 5400 aatagtttgc tacataattt ttctaatgtt tatggttgcg acaatatcta ttattgatgt 5460 actaatgaat ggagagttca ttaaattgct cggatatgat gagcattatg caggagaaca 5520 attaaactta attaatagct atgatgggat ggtccgggct acaggcggtt ttagtgatgc 5580 tctcaatttt ggatatatgc tcacattagg tgttttgtta tgtatggagt gtttttccca 5640 aggatataaa agattattga tgcttattat tagttttgtg ctatttatag cgatctgcat 5700 gagtcttact agaggagcaa tacttgttgc tgcgcttatt tacgcacttt atataatttc 5760 aaatcggaag atgctttttt gtggaataac tttatttgta ataattatac ccgttttagc 5820 aatttctact aatatttttg acaactatac agaaattttg atcggcaggt ttacagattc 5880 gtctcaggca tcgcgtggat ctacacaggg gcggatagat atggcaatta attcattaaa 5940 cttcctgtca gaacatccat caggtatagg tctgggtact caaggttcag gaaacatgct 6000 ttcggtaaaa gataataggt taaatacgga taattatttt ttctggatcg cccttgagac 6060 tggtattatt ggcttaatca taaatattat ttatctggca agtcaatttt attcttcaac 6120 tttactaaat agaatatatg gcagtcattg tagcaatatg cactatagat tatattttct 6180 ctttggaagt atatatttta taagtgcagc gttaagttca gcaccttcgt catcaacttt 6240 ttctatatat tattggacag ttttagcttt gattccattt ttaaaattaa caaatagacg 6300 gtgcacgcga taatgaataa taaaaaggtt ttgatggata ttagttggtc taataaaggg 6360 gggattggac gttttactga tgaaatttct aaactactat gtgatatatc taaggaggaa 6420 ctatatagaa aatgtgcttc tccgctggcc ccattaggtt tagcagtcaa tatttttctg 6480 cgaaagaaaa ctgatgtggt ttttcttcct ggctatattc caccactttt ttgttcgaaa 6540 aagttcataa taacaataca tgatctaaat catctggatt taaatgataa ttcctctctt 6600 tttaagaggt tattttataa ttttataata aagcgcggtt gtagaaaagc atataaaata 6660 tttacagttt cgaatttttc aaaagaaaga atagtagcat ggtcaggtgt aaaccctaat 6720 
aaaatagtca cggtatataa tggggtatct agtctattta atgccgatgt aaaaccattg 6780 aatttaggct ataaatattt gctatgtgta ggaaacagaa aaactcataa gaatgagaag 6840 tgtgttatat ctgcctttgc caaagcagat attgatccat caataaaact cgtttttact 6900 ggtaatcctt gtaatgattt agaaaaacta ataatacaac atggtttaag tgaacgtgta 6960 aagttctttg ggttcgtgtc tgaaaaagat ttaccatcgt tatataaggg ctcgttagga 7020 ttagttttcc cttctttata tgaaggtttt ggattacctg tagtggaggg catggcctgt 7080 ggtattcctg tattaacttc tctaacttca tcattgccag aggtggctgg agatgcagcg 7140 attcttgtcg accctctttc ggaagatgct attactaaag gaatttcgag gttaattaat 7200 gattctgaac ttcgtaagca tttaatccaa aaggggcttt tgcgggcaaa gaggttcaat 7260 tggcaaaacg tggttagtga gattgaaatg gtactgacag aggcatgtga tggaaataaa 7320 tgaaataaaa atatctctcg ttcatgagtg gttattaagt tatgcaggct ccgaacaggt 7380 atcatctgcc atcctgcatg tttttcctga agcgaagtta tattcggtgg ttgattttct 7440 aacggatgaa caaagaagac attttctggg gaaatatgcg actaccacat ttattcaaaa 7500 tttacctaaa gctaaaaaat tttaccagaa atatttacca ctaatgccac tggctattga 7560 acaacttgat ttatcagatg ctaatatcat cattagtagc gcccattccg ttgcaaaagg 7620 tgttatttcc ggaccagatc agcttcacat tagctatgtt cattctccta ttcgatatgc 7680 gtgggattta cagcatcagt accttaatga gtctaacctg aataaaggaa ttaaaggttg 7740 gttagcaaaa tggcttcttc acaaaatacg aatttgggat tctcgaaccg caaatggggt 7800 tgatcatttt atagctaatt ctcaatatat cgcgcgtaga attaaaaaag tatacagacg 7860 tgaggcttca gttatatatc cgcctgtaga tgtggataat tttgaagtaa aaaatgaaaa 7920 gcaagactat tatttcacag catcccgtat ggtaccctac aaacgtattg atcttattgt 7980 cgaagccttt agtaaaatgc cggaaaagaa attagtagtt attggtgatg gaccggagat 8040 gaaaaaaata aagagcaagg ctacagacaa tataaaattg ctcggttatc aatcttttcc 8100 tgttttaaaa gagtatatgc agagcgccag ggcgtttgtt tttgcagcgg aagaggactt 8160 tggaataata cctgtcgaag ctcaagcttg cggtacccct gttattgcct ttgggaaggg 8220 tggggcctta gaaaccgttc gcccactagg tgtagaggaa ccgactggca ttttcttcaa 8280 ggaacagaat attgcttctt tgcatgaagc tgttagtgaa tttgaaaaaa atgcatcatt 8340 ttttacatct caggcttgta gaaaaaatgc agaaaaattt tctcgatcaa gatttgaaca 8400 agaatttaag 
aactttgtta atgaaaagtg gaatcttttc aaaacagaac agattattaa 8460 acgttaatta tggtttattg aatgtctaaa ttaataccag taataatggc cggtgggatt 8520 ggtagccgtt tgtggccact ttcacgtgaa gagcatccga aacagttttt aagcgtagat 8580 ggtgaattat ctatgctgca aaacaccatt aaaagattga ctcctctttt ggctggagaa 8640 cctttagtca tttgtaatga tagtcaccgc ttccttgtcg ctgaacaact tcgagctata 8700 aataaactag caaataacat catattagag ccagtggggc gtaatacagc cccagctata 8760 gcgctggccg ctttttgttc acttcagaat gtcgtcgatg aagacccgct tttgcttgtc 8820 cttgctgcgg atcatgtcat ccgcgatgag aaagtgtttc ttaaagctat caatcacgct 8880 gaattttttg caacacaagg taagctagta acgtttggta ttgtacccac acaggccgaa 8940 actggctacg gttatatttg tagaggtgaa gcaatcgggg aagatgcttt ttctgtagcc 9000 gaatttgtag agaagcctga tttcgataca gcgcgtcatt atgtagaatc agagaaatat 9060 tattggaaca gcggtatgtt cctatttcgt gcaagtagtt acttacaaga attaaaggat 9120 ctgtcccccg atatttacca agcatgtgaa aatgcggtag ggagtattaa tcctgatctt 9180 gattttatcc gtattgataa agaagcattc gcaatgtgcc ctagtgattc tatcgattat 9240 gcggtaatgg aacatactag gcatgcagtt gtcgtaccga tgaatgccgg ctggtcagat 9300 gtggggtcat ggtcttcact gtgggatatt tctaagaaag atccacaacg taatgtatta 9360 catggcgata tttttgcata taatagtaaa gataattata tctattctga aaaatcgttt 9420 attagtacaa tcggagtaaa taatttagtt atcgtgcaga cagcagatgc attattagta 9480 tctgataaag attcagtcca ggatgttaaa aaagttgttg attatttaaa agctaataat 9540 agaaacgaac ataaaaaaca tttagaggtt ttccgaccgt ggggaaaatt tagcgtaatt 9600 catagtggcg ataattattt agttaaaaga ataactgtta aaccaggcgc gaagtttgct 9660 gctcagatgc atctccatcg tgctgagcat tggatagtgg tatctggtac tgcttgtatt 9720 actaaggggg aagaaatttt tacaatttcg gagaatgaat caacatttat acctgctaat 9780 acagttcata cgttaaaaaa ccccgcgact attccattag aactaataga aattcaatct 9840 ggcacctatc ttgcggagga tgatattatt cgcctggaga aacattctgg atatctggag 9900 taatgaattg atgaaaaata tatataatac ttacgatgtt atcaacaaat ctggaattaa 9960 ttttggaacc agtggtgccc gcggccttgt taccgatttt acacccgaag tttgcgcacg 10020 atttaccatt tcctttttga cagtaatgca gcaaagattc tcatttacaa cggttgcgct 10080 cgcaattgat 
aatcgtccaa gcagttacgc gatggctcaa gcttgtgccg ctgctttgca 10140 agaaaaagga attaaaaccg tttactatgg cgtaattcca acacctgctt tagctcatca 10200 atcaatttcc gataaagtac ctgcaatcat ggttactggc agtcatatcc cttttgaccg 10260 taatggcctg aaattttata gaccagatgg tgaaattact aaagatgatg agaatgctat 10320 tattcatgtt gatgcctcat ttatgcagcc taagcttgaa caattgacaa tttccacaat 10380 cgctgctaga aattatattc tacgatatac ctcattattt ccaatgccat tcttgaaaaa 10440 taagcgcatt ggaatttatg agcattctag tgcgggtcgt gatctctata agacgttatt 10500 caaaatgttg ggtgctacag ttgttagttt agcaaggagc gacgaatttg ttcctattga 10560 tactgaagct gtaagtgaag atgatagaaa taaagcaatc acatgggcaa aaaaatatca 10620 gttagatgct atattttcaa ctgatggtga tggagatcgc cctctgatag ctgacgaata 10680 tggaaattgg ttaagaggag atatattagg ccttctgtgc tctctcgaat tagctgctga 10740 tgcagtcgct attcctgtaa gctgcaacag tacaatctca tctggtaact tttttaaaca 10800 tgtggaacga acaaagattg gttcacccta tgtgattgca gcatttgcta aattatctgc 10860 aaactataat tgtatagctg gttttgaagc gaatggtggc tttctgctag gtagcgatgt 10920 ttatattaat cagcgtttac ttaaggcatt accaacacgt gatgctttat tacctgccat 10980 tatgcttctg tttggtagca aggacaaaag tattagtgag cttgttaaaa aacttcctgc 11040 tcgctatacc tattcaaaca gattacagga tataagtgtt aaaacaagta tgtctttaat 11100 aaatcttggt ctgacagatc aagaggattt tttgcagtat attggtttta ataaacatca 11160 tatattacat tctgatgtta ctgatggctt tagaatcact atcgataaca acaatattat 11220 tcatttacga ccttcaggca atgcccctga gttgcgttgc tatgcggagg ctgactcgca 11280 agaggatgca tgtaatattg ttgaaactgt tctctctaat atcaaaagca aactgggtag 11340 agcttaatgc tgttgataat agagcgtttc tttccagtaa tactttgtct ggttatctgg 11400 tacccaagtt gagggtgaga attaaatgga tcgttttgat aataagtata acccaaattt 11460 atgcaaaata ttattggcta tatcagattt actgtttttt aatgtagcct tatgggcatc 11520 gttaggagtt gtatatttaa tctttgatga agttcagcga tttgtaccac aagagcaatt 11580 agataatcga tttatatcac attttattct atctatagta tgcgttggat ggttttgggt 11640 tcgactgcgt cactatacat atcgaaagcc attctggtat gagttgaaag aggttattcg 11700 tactatcgtt atttttgctg tgtttgattt ggctttaatt gcgtttacaa aatggcagtt 
11760 ttcacgctat gtctgggtgt tttgttggac ttttgccata atcctggtgc ctttttttcg 11820 cgcacttaca aagcatttat tgaacaagct aggtatctgg aagaaaaaaa ctatcatcct 11880 tgggagcgga cagaatgctc gtggtgcata ttctgcgctg caaagtgagg agatgatggg 11940 gtttgatgtt atcgcttttt ttgatacgga tgcgtcagat gctgaaataa atatgttgcc 12000 ggtgataaag gacactgaga ctatttggga tttaaatcgt acaggtgatg tccattatat 12060 ccttgcttat gaatacaccg agttggagaa aacacatttt tggctacgtg aactttcaaa 12120 acatcattgt cgttctgtta ctgtcgtccc ctcgtttaga ggattgccat tatataatac 12180 tgatatgtct tttatcttta gccatgaagt tatgttatta aggatacaaa ataacttggc 12240 taaaaggtcg tcccgttttc tcaaacggac atttgatatt gtttgttcaa taatgattct 12300 tataattgca tcaccactta tgatttatct gtggtataaa gttactcgag atggtggtcc 12360 ggctatttat ggtcaccagc gagtaggtcg gcatggaaaa ctttttccat gctacaaatt 12420 tcgttctatg gttatgaatt c 12441
4 22080 DNA Salmonella enterica 4
gaattcggga ggcgcaatga aagtcagctt ttttctgctg aaatttccac tctcatcgga 60 aacctttgtg ctgaatcaga ttactgcgtt tattgatatg ggccatgagg tggagattgt 120 cgcgttacaa aaaggcgata cccaacatac tcacgccgcc tgggagaagt atggcctggc 180 ggcgaaaacc cgctggttac aggatgagcc ccagggacgg ctggcgaaac tgcgctaccg 240 ggcatgtaaa acgctgccgg ggctgcatcg ggcggcgacc tggaaagcgc tcaattttac 300 ccgctatggc gatgaatcac gcaatttgat cctttccgcg atttgcgcgc aggtgagcca 360 gccttttgtg gcggatgtgt ttatcgcaca ctttggtccg gcgggcgtga cggcggccaa 420 actacgcgaa ctgggcgtgc ttcgcggcaa aatcgcgact attttccacg ggattgatat 480 ctctagtcgt gaggtgctca gtcattacac gccggagtat cagcagttgt ttcgtcgtgg 540 cgatctgatg ctgcccatca gcgatctgtg ggccggtcgc ctgaaaagta tgggctgtcc 600 gccggaaaag attgccgttt cgcgcatggg cgtcgacatg acgcgtttta cccatcgttc 660 ggtgaaagcg ccagggatgc cgctggagat gatttccgtc gcgcgcctga cagaaaaaaa 720 aggcctgcat gtggcgattg aagcctgtcg gcaactgaaa gcacagggcg tggcgtttcg 780 ctaccgcatt ctggggattg gcccgtggga acgtcggctg cgcacgctca tcgagcagta 840 tcagctagag gatgtcattg agatgccggg gtttaaaccg agccatgaag tgaaggcgat 900 gctggatgac gccgatgttt ttttgctgcc gtcgattacc ggtacggatg gcgatatgga 960 aggtattccg
gtagcgctga tggaggcgat ggcggtaggg attcccgtgg tatctaccgt 1020 gcatagcggt attccggaac tggtggaggc cggcaaatcc ggctggctgg tgccggaaaa 1080 cgatgcgcag gcgctggcgg cccgactcgc tgagttcagc cggattgacc acgacacgct 1140 ggagtcggtg atcacgcgcg cccgtgaaaa agtggcgcaa gattttaatc agcaggcgat 1200 taatcgccag ttagccagcc tgctacaaac gatataaacg aggtggtatg cccgcgacta 1260 aattctcccg acgtaccctc ctgacggcag gttctgcgct tgctgttctt ccttttctgc 1320 gcgccttgcc ggtacaggcg cgtgaacctc gcgagaccgt cgatattaag gattatccgg 1380 cggatgacgg tatcgcctcg ttcaaacagg ccttcgccga cggacagacc gtggtcgtac 1440 cgccaggatg ggtgtgtgaa aatatcaatg cggcgataac gattccggcg ggaaaaacgc 1500 tgcgggtaca gggcgcggtg cgtgggaatg gccggggacg gtttattttg caggacgggt 1560 gtcaggtggt gggggagcag ggcggcagtc tgcacaatgt gacgctggat gttcgcgggt 1620 cggactgtgt gattaaaggc gtggcgatga gcggctttgg ccccgtcgcg caaattttca 1680 tcggtggtaa ggaaccgcag gtgatgcgta atctcattat cgatgacatc accgttaccc 1740 acgccaacta cgccattctc cgccagggat ttcataacca aatggatggc gcgcggatta 1800 cgcatagccg ctttagcgat ttacaggggg acgccattga gtggaatgtc gcgattcacg 1860 accgcgacat cctgatttcc gatcatgtca tcgaacgcat taattgtacc aatggcaaaa 1920 tcaactgggg gatcggcatc gggctggcgg gtagcaccta tgacaacagt tatcctgaag 1980 accaggcagt aaaaaacttt gtggtggcca atattaccgg atctgattgc cgacagcttg 2040 tgcacgtaga aaatggcaaa catttcgtca ttcgcaatgt caaagccaaa aacatcacgc 2100 ccggtttcag taaaaatgcg ggtattgata acgcaacgat cgcaatttat ggctgtgata 2160 atttcgtcat tgataatatt gatatgacga atagtgccgg gatgctcatc ggctatggcg 2220 tcgttaaagg aaaatacctg tcaattccgc aaaactttaa attaaacgct attcggttgg 2280 ataatcgcca ggttgcttat aaattacgcg gcattcaaat ttcctccggc aacaccccct 2340 cttttgtcgc catcaccaat gtacggatga cgcgtgctac gctggaactg cataatcaac 2400 cgcagcacct ctttctgcgc aatatcaacg tgatgcaaac ttcagcgatt ggcccggcgt 2460 taaaaatgca tttcgatttg cgtaaagatg tacgtggtca atttatggcc cgccaggaca 2520 cgctgctttc cctcgctaat gttcatgcca tcaatgaaaa cgggcagagt tccgtggata 2580 tcgacaggat taatcaccaa accgtgaatg tcgaagcagt gaatttttcg ctgccgaagc 2640 ggggagggta agtaccgcta 
tttttacgaa aattcctggg aaaaagttgt tcatacttaa 2700 tgttatggtg ccgactaaga cgtaatgtag agcgtgccat cattatccct ggcagcagag 2760 taattcatgc tggcgaaaac aagctaaaga gctataattc agcaaccatt ttacaggtgg 2820 aagaaacaat gatgaatttg aaagcagtta taccggtagc gggtttgggt atgcatatgt 2880 tgcctgccac caaggcaatc ccaaaagaga tgctaccgat cgtcgacaag ccaatgattc 2940 agtacattgt cgatgagatt gtggctgcag ggatcaaaga aatcgtgctg gtgactcacg 3000 cgtctaaaaa cgccgttgag aaccacttcg acacctctta tgaacttgaa tcacttcttg 3060 agcagcgcgt taagcgtcag cttttggcgg aagtgcaatc tatctgccca ccgggcgtga 3120 cgattatgaa cgttcgccag gcgcagccgt tagggctggg gcattctatt ctgtgcgcgc 3180 gtccggtcgt gggcgataac cctttcattg tggtactccc ggatattatt atcgatgatg 3240 ctaccgccga tccgctgcgc tataaccttg cggcgatggt ggcgcgtttc aatgaaacgg 3300 gtcgcagcca ggtgctggcg aagcgcatga aaggtgattt atcggagtat tccgttatcc 3360 agacgaaaga acctctggat aatgaaggca aagtcagccg gattgtggag tttatcgaaa 3420 aaccggatca gccgcagacg ctggattccg atttgatggc ggtaggccgt tatgtgcttt 3480 cagccgacat ctgggcggaa ctggaaagaa ccgaaccggg cgcctggggc cgcatccagc 3540 tcaccgatgc cattgctgaa ctggcgaaaa aacagtcggt tgacgcgatg ctaatgacgg 3600 gtgacagcta tgactgcggt aaaaaaatgg gctacatgca ggcatttgtg aagtacgggc 3660 tgcgcaacct gaaagaagga gccaagttcc gtaagagcat agagcagctt ttgcatgaat 3720 aagtattaac aaccgtgata aatggttggt gataaacata ataacggcag tgaacattcg 3780 aagcggcaag ttggctgaaa cgagtgttga ctgccgtttt agttttgtat aaagggctta 3840 agtaacaagg ggttatctgg agcattttaa tgctgatttt ataagattaa tccttgtttc 3900 cggatgcaat taataagaca attagcgttt aagttttagt gagctttgcc ctgctgggcg 3960 aggtttgcaa caagtcgata tgtacgcagt gcactggtag ctgatgagcc aggggcggta 4020 gcgtgtgtaa cgacttgagc aattaatttt tattggcaaa ttaaatacca cattaaatac 4080 gccttatgga atagaaaagt gaagatactt attactggcg gggcaggttt tattggatca 4140 gctgttgtcc gccatattat taagaataca caggacactg tagttaatat tgataaatta 4200 acctacgccg gtaatcttga atccctttct gatatttctg aaagtaatcg ctacaatttt 4260 gaacacgcgg atatttgtga ttccgctgaa ataacgcgta tttttgagca gtaccagccg 4320 gacgcggtga tgcatttggc tgcggaaagt 
catgtggacc gttcgattac cgggccagca 4380 gcatttattg aaaccaatat cgtcggcacc tatgcacttc ttgaagttgc gcgtaaatac 4440 tggtctgccc ttggcgaaga taaaaaaaat aattttcgtt ttcatcatat ttccactgat 4500 gaagtttacg gcgatttacc gcatcctgat gaagttgaaa acagcgttac gctgccgtta 4560 tttactgaaa cgacggcata tgcgccaagt agcccctatt ctgcgtcaaa agcatccagc 4620 gatcatttag tccgtgcctg gcggcgtacc tatggtctac caacgatcgt taccaattgt 4680 tctaataact atggccctta tcacttccct gaaaaactga ttccgttggt cattttgaac 4740 gcactggaag gaaagccttt gccaatttat ggcaaagggg atcagattcg cgattggcta 4800 tatgtagaag atcatgctcg cgcgcttcat atggtagtga ctgaaggcaa ggcaggggag 4860 acttataaca ttggtggaca caatgagaag aaaaatctcg atgtggtatt taccatctgt 4920 gatctgctgg atgagattgt acccaaagcg acttcttatc gtgaacaaat cacttatgtc 4980 gcggatcgtc cgggccatga tcgtcgttat gccattgatg caggtaaaat tagccgcgaa 5040 ttaggctgga aaccgctgga gacctttgaa agcggtattc gtaaaacagt ggaatggtac 5100 cttgcaaata ctcaatgggt aaacaatgtt aaaagtgggg cgtatcagag ttggatagaa 5160 cagaactatg aaggacgcca gtaatgaata tcttactttt tggtaagaca gggcaagtag 5220 gctgggagtt gcaacgttct ctggcaccgg tagggaatct gattgccctg gatgtccatt 5280 caaaagagtt ttgcggtgat tttagtaatc cgaaaggcgt tgccgaaacc gttcgtaagc 5340 ttcgtcccga tgtgattgtt aacgcagcag cccatactgc agtagataaa gcagagtctg 5400 aaccagaact ggcgcagtta cttaacgcca ccagtgtgga agccatcgct aaagcagcca 5460 acgaaactgg cgcatgggta gtgcattatt caaccgatta tgtatttcct ggtaccggcg 5520 atatcccatg gcaggaaacg gacgctacgt cgccgctgaa tgtctatggc aaaaccaaac 5580 tggcgggaga aaaggccctg caggataact gccctaaaca ccttatcttc cgcaccagtt 5640 gggtttatgc aggtaagggc aataatttcg caaagacaat gcttcgtctg gcgaaagagc 5700 gtcagacact ttcagtcatt aacgatcagt acggtgcgcc aaccggtgcg gaattactgg 5760 ctgactgtac ggcgcatgcg atccgtgtgg cgttaaataa accagaagtc gcaggtcttt 5820 accatctggt tgccggggga accacaacct ggcatgacta cgcggcctta gtctttgacg 5880 aggcgcgcaa agcagggata acgcttgcgc tgactgagct taatgctgtg ccgaccagcg 5940 cctacccgac gccggcgagc agaccaggca attcgcgtct caatactgaa aagtttcagc 6000 gtaattttga ccttattctg cctcaatggg aattaggagt 
taagcgtatg ctgactgaaa 6060 tgtttacgac gacaaccatc taataaattt aaatgcccat cagggcattt tctatgaatg 6120 agaaatggaa atgaaaacgc gtaagggcat tattttagcg gggggctccg gcacccgtct 6180 ttatccggtg accatggcgg taagtaagca attgctacca atttatgata aaccgatgat 6240 ttactatccc ctttccacgc ttatgctggc aggcattcgg gatatcctga tcatcagtac 6300 gccacaggac acgccgcgtt ttcaacaact gctgggagac ggcagccagt gggggctgaa 6360 tcttcaatat aaagtacagc caagcccgga tggcttagca caggcgttta ttattggtga 6420 agagttcatt ggtcatgatg attgtgcatt agtgctgggt gacaatatct tctatggtca 6480 tgatttacca aagttaatgg aagctgccgt taataaagaa agtggtgcta ccgtcttcgc 6540 ttatcatgta aacgatccgg agcgctacgg tgtggttgag tttgaccaaa agggcacagc 6600 cgttagtctg gaagaaaaac cattacaacc gaagagtaat tacgcggtaa cggggctgta 6660 tttttatgat aatagcgtgg tggagatggc gaaaaatctt aagccttccg ctcgcggtga 6720 gttagaaatc acggatatta accgtatcta tatggagcag ggaagattgt ctgtcgctat 6780 gatggggcgc ggttatgcct ggctggatac agggacgcat cagagtttga tagaggccag 6840 taattttatt gcaaccatcg aagaacgcca ggggctaaaa gtgtcctgcc cggaagagat 6900 cgcatttcgt aaaaatttta taaatgcaca acaggttata gaactggccg ggccattatc 6960 aaaaaatgat tatggcaaat atttgctgaa gatggtgaaa ggtttataag tgatgattgt 7020 gattaaaaca gcaataccag atgtcttgat cttagagcct aaagtttttg gcgatgagag 7080 gggattcttt tttgaaagtt ataaccagca gacctttgaa gagttgattg gacgtaaagt 7140 tacatttgtt caagataatc attcaaaatc caaaaagaac gtactcagag ggctacattt 7200 tcagagagga gaaaatgcac aggggaagtt agttcgttgt gctgtcggtg aggtttttga 7260 tgttgcggtc gatatccgaa aagaatcgcc tacttttggt caatgggttg gtgtaaatct 7320 gtctgctgag aataagcgac agctttggat tccagaaggt tttgctcatg gttttgttac 7380 tcttagtgag tatgcagagt ttctgtacaa agcaactaat tattactcac cttcatcgga 7440 aggtagcatt ctatggaatg atgaggcaat aggtattgaa tggccttttt ctcagctgcc 7500 tgagctttca gcaaaagatg ctgcagcacc tttactggat caagccttgt taacagagta 7560 agcatcgtgt ctcatattat taagattttt ccatcaaata ttgaattttc cggtagagag 7620 gatgaatcaa tcctcgatgc tgcgctatcg gctggtatcc atcttgaaca tagctgcaaa 7680 gcgggtgatt gtggtatctg tgagtccgat ttgttggcgg gagaagttgt 
tgactccaaa 7740 ggtaatattt ttggacaggg tgataaaata ctaacctgct gctgtaaacc taaaaccgcc 7800 cttgagctaa atgcgcattt ttttcctgaa ctagctggac agacaaaaaa aattgtccca 7860 tgcaaggtaa atagtgctgt actggtttca ggcgatgtta tgactttgaa gttacgcaca 7920 ccaccaacag caaaaattgg cttccttcca gggcagtata tcaatttaca ttataaaggt 7980 gtaactcgca gttattctat cgctaatagt gatgagtcga atggtattga gttgcatgta 8040 aggaatgttc ccaatggtca gatgagttcg ctcatttttg gggagttaca agaaaatact 8100 cttatgcgca ttgaagggcc ttgcggaaca ttttttattc gtgaaagtga cagacctata 8160 atcttccttg caggcggtac tggattcgct ccagttaaat caatggttga gcatctcatt 8220 cagggaaaat gtcgtcgtga gatctacatt tactggggaa tgcaatatag taaagatttt 8280 tactctgcat taccgcagca gtggagtgaa cagcacgaca acgttcatta tatccctgtt 8340 gtttctggtg atgacgccga atggggggga agaaagggat ttgtccatca tgccgtgatg 8400 gatgattttg attctctaga gttcttcgat atatatgcat gtggttcacc tgtgatgatc 8460 gatgccagta aaaaggactt tatgatgaaa aatctctctg tagaacattt ctattctgat 8520 gcatttaccg catctaataa tattgaggat aatttatgaa agcggtcatc ctggctggtg 8580 gacttggtac cagactaagt gaagaaacaa ttgtaaaacc aaaaccgatg gtagaaattg 8640 gtggcaagcc tattctttgg cacattatga aaatgtattc tgtgcatggt atcaaggatt 8700 ttattatctg ctgtggttat aaaggatatg tgattaaaga atattttgcg aactacttcc 8760 ttcacatgtc agatgtaaca ttccatatgg ctgaaaaccg tatggaagtt caccataaac 8820 gtgttgaacc atggaatgtc acattggttg atacgggtga ttcttcaatg actggtggtc 8880 gtctgaaacg tgttgctgaa tacgtaaaag atgacgaggc tttcctgttt acttatggtg 8940 atggcgttgc cgaccttgat atcaaagcga ctatcgattt ccataaggct cacggtaaga 9000 aagcgacttt aacagctact tttccaccag gacgctttgg cgcattagat atccgagctg 9060 gtcaggtccg gtcattccag gaaaaaccga aaggcgatgg ggcaatgatc aatggtggtt 9120 tctttgtgtt gaatccatcg gttatcgatc tcatcgataa cgatgcaaca acctgggaac 9180 aagagccatt aatgacattg gcacaacagg gggagttaat ggcttttgaa cacccaggtt 9240 tctggcagcc gatggatacc ctacgtgata aagtttacct cgaagggctg tgggaaaaag 9300 gtaaagctcc gtggaaaacc tgggagtaac tagatgattg ataaaaattt ttggcaaggt 9360 aaacgtgtat tcgttaccgg ccatactggc tttaaaggaa gctggctttc gctatggctg 
9420 actgaaatgg gtgcaattgt aaaaggctat gcacttgatg cgccaactgt tccaagttta 9480 tttgagatag tgcgtcttaa tgatcttatg gaatctcata ttggcgacat tcgtgatttt 9540 gaaaagctgc gcaattctat tgcagaattt aagccagaaa ttgttttcca tatggcagcc 9600 cagcctttag tgcgcctatc ttatgaacag ccaatcgaaa catactcaac aaatgttatg 9660 ggtactgtcc atttgcttga aacagttaag caagtaggta acataaaggc agtcgtaaat 9720 atcaccagtg ataagtgcta cgacaatcgt gagtgggtgt ggggctatcg tgagaacgaa 9780 cccatgggag ggtacgatcc atactctaat agtaaaggtt gtgcagaatt agtcgcgtct 9840 gcattccgga actcattctt caatcctgca aattatgagc aacatggcgt tggtttggcg 9900 tctgtgaggg ctggtaatgt cataggcgga ggcgattggg ctaaagaccg tttaattccc 9960 gatattctgc gctcatttga aaataaccag caggttatta ttcgaaaccc atattctatc 10020 cgtccctggc agcatgtact ggagcctctt tctggttaca ttgtggtggc gcaacgctta 10080 tatacagaag gtgctaagtt ttctgaagga tggaatttcg gcccgcgtga tgaagatgcg 10140 aagacggtcg aatttattgt tgacaagatg gtcacgcttt ggggtgatga tgcaagctgg 10200 ttactggatg gtgagaatca tcctcatgag gcacattacc tgaaactgga ttgctctaaa 10260 gcaaatatgc aattaggatg gcatccgcgt tggggattga ctgaaacact tggtcgcatc 10320 gtaaaatggc ataaagcatg gattcgcggc gaagatatgt tgatttgttc aaagcgtgaa 10380 atcagcgact atatgtctgc aactactcgt taagaaaata agtttaagga atcaaagtaa 10440 tgacagcaaa taacctgcgt gagcaaatct ctcagcttgt cgctcagtat gcgaatgagg 10500 cattgagccc gaaacctttt gttgcaggta caagcgttgt gcctccttcc gggaaggtta 10560 ttggtgccaa agagttacaa ttgatggttg aggcgtctct tgatggatgg ctaactactg 10620 gtcgtttcaa tgatgccttt gaaaaaaaac ttggggaatt tattggggtt cctcatgttt 10680 taacgacaac atctggctct tcggcaaact tgctggcact gactgcgctg acttccccaa 10740 aattaggcga gcgagctctc aaacctggtg atgaggttat tactgtcgct gctggcttcc 10800 cgactacagt taacccggcg atccagaatg gtttaatacc ggtattcgtg gatgttgata 10860 tcccgacata taatatcgat gcctctctca ttgaagctgc agttactgag aaatcaaaag 10920 cgataatgat cgctcataca ctcggtaatg catttaacct gagtgaagtt cgtcggattg 10980 ccgataaata taacttatgg ttgattgaag actgctgtga tgcccttggg acgacttatg 11040 aaggccagat ggtaggtacc tttggtgaca tcggaaccgt tagtttttat 
ccggctcacc 11100 atatcacaat gggtgaaggc ggtgctgtat tcaccaagtc aggtgaactg aagaaaatta 11160 ttgagtcgtt ccgtgactgg ggccgggatt gttattgtgc gccaggatgc gataacacct 11220 gcggtaaacg ttttggtcag caattgggat cacttcctca aggctatgat cacaaatata 11280 cttattccca cctcggatat aatctcaaaa tcacggacat gcaggcagca tgtggtctgg 11340 ctcagttgga gcgcgtagaa gagtttgtag agcagcgtaa agctaacttt tcctatctga 11400 aacagggctt gcaatcttgc actgaattcc tcgaattacc agaagcaaca gagaaatcag 11460 atccatcctg gtttggcttc cctatcaccc tgaaagaaac tagcggtgtt aaccgtgtcg 11520 aactggtgaa attccttgat gaagcaaaaa tcggtacacg tttactgttt gctggaaatc 11580 tgattcgcca accgtatttt gctaatgtga aatatcgtgt agtgggtgag ttgacaaata 11640 ccgaccgtat aatgaatcaa acgttctgga ttggtattta tccaggcttg actacagagc 11700 atttagatta tgtagttagc aagtttgaag agttctttgg tttgaatttc taattcaatt 11760 tattctatct ggtgattgcg atgacctttt tgaaagaata tgtaattgtc agtggggctt 11820 ccggctttat tggtaagcat ttactcgaag cgctaaaaaa atcggggatt tcagttgtcg 11880 caatcactcg agatgtaata aaaaataata gtaatgcatt agctaatgtt agatggtgca 11940 gttgggataa tatcgaatta ttagtcgagg agttatcaat tgattctgca ttaattggta 12000 tcattcattt ggcaacagaa tatgggcata aaacatcatc tctcataaat attgaagatg 12060 caaatgttat aaaaccatta aagcttcttg atttggcaat aaaatatcgg gcggatatct 12120 ttttaaatac agatagtttt tttgccaaga aagattttaa ttatcaacat atgcggcctt 12180 atataattac taaaagacac tttgatgaaa ttgggcatta ttatgctaat atgcatgaca 12240 tttcatttgt aaacatgcga ttagagcatg tatatgggcc tggggatggt gaaaataaat 12300 ttattccata cattatcgac tgcttaaata aaaaacagag ttgcgtgaaa tgtacaacag 12360 gcgaacagat aagagacttt atttttgtag atgatgtggt aaatgcttat ttaactatat 12420 tagaaaatag aaaagaagta ccttcatata ctgagtatca agttggaact ggtgctgggg 12480 taagtttgaa agattttctg gtttatttgc aaaatactat gatgccaggt tcatcgagta 12540 tatttgaatt tggtgcgata gagcaaagag ataatgaaat aatgttctct gtagcaaata 12600 ataaaaattt aaaagcaatg ggctggaaac caaatttcga ttataaaaaa ggaattgaag 12660 aactactgaa acggttatga gattttcatg atcttttaat aaataaatcg ttaacaaatt 12720 agtcgcgtta tgttgtaaaa actaagtcgt 
ttaattgcat agtgaaagtt caattgttaa 12780 aaattccgag tcatttaatt gttgcaggtt catcatggtt atccaaaata ataattgccg 12840 gggtgcagtt agcaagtatt tcatatctta tttctatgct aggtgaagag aaatatgcaa 12900 tctttagttt gttaactggt ttattagtat ggtgtagcgc tgttgatttt ggcataggta 12960 caggactgca aaattatata tcagaatgca gagccaaaaa caaaagttat gatgcatata 13020 ttaaatcagc attacatcta agctttatag ctattatttt ttttattgct ttattttata 13080 ttttttctgg ggtaatttcc gctaaatatc tttcttcttt tcatgaggta ttacaggaca 13140 aaaccagaat gctctttttt acctcatgtc tggttttcag ttctattgga atcggagcta 13200 ttgcttataa aatacttttt gccgaattgg tcgggtggaa agctaatcta ttaaacgcat 13260 tatcttatat gataggtatg ctcggcttgc tatatatata ctataggggg atctcagttg 13320 acataaaatt atcactaata gtcctgtatc ttccagtggg tatgatttca ttgtgctata 13380 ttgtatatag atacataaag ctttatcatg ttaaaacaac aaaatctcat tatatagcaa 13440 ttttacgtag atcttcaggg ttttttcttt ttactttatt atcgatagtg gtgcttcaaa 13500 cagattatat ggtcatttct caaaggctaa ctcctgctga tattgttcaa tatacagtaa 13560 cgatgaaaat ttttggttta gtctttttta tttatactgc tattttgcaa gcattatggc 13620 ctatatgtgc tgaattgaga gtcaaacagc aatggaaaaa acttaacaaa atgataggtg 13680 tcaatatttt gcttggctca ctatatgttg ttggatgtac aatatttatt tatttattta 13740 aagaacagat attttcagta atagccaaag atattaatta tcaagtttct attttatctt 13800 ttatgttaat tggcatatat ttctgtattc gcgtttggtg tgacacttat gcaatgttat 13860 tgcaaagtat gaattattta aaaatacttt ggatattagt accactacaa gcaataattg 13920 gtggaatagc acaatggtat ttttctagta cgcttggaat cagtggagtg ctgcttggct 13980 tgattatatc ttttgcttta actgtttttt gggggcttcc actaacttac ttaattaagg 14040 caaataaggg ataatcatat gcttatatca ttttgtattc caacttataa tagaaaacaa 14100 tatcttgaag agttgttgaa tagtataaat aatcaggaaa aatttaattt agatattgag 14160 atatgtatat cagataatgc ctctactgat ggtacagagg aaatgattga tgtttggagg 14220 aacaattata atttcccaat aatatatcgg cgtaatagcg ttaaccttgg gccagatagg 14280 aattttcttg cttcagtatc ccttgcgaat ggggattatt gttggatatt tggcagtgat 14340 gatgctcttg cgaaagactc gttagcgata ttacaaactt atctcgattc tcaagcagat 14400 atatatttat 
gtgacagaaa agagaccggg tgtgatttag ttgagattag aaaccctcat 14460 cgttcttggc tcagaacaga tgatgaactt tatgtgttta ataataattt agatagggaa 14520 atctatctca gtagatgctt atctattggt ggtgtattta gctatctaag ttctttaata 14580 gtaaaaaaag aacgatggga tgccattgat tttgatgcgt cctatattgg cacttcctat 14640 cctcatgtat ttatcatgat gagcgtattt aatacgccag ggtgcctttt gcattatata 14700 tcaaaaccac tcgtaatatg ccgaggagat aatgatagtt tcgagaagaa aggaaaggcc 14760 agacgaattt taattgattt tattgcatat ttaaaattag ctaatgattt ttacagtaaa 14820 aatatatctt taaaacgagc atttgaaaat gttttgctaa aagagagacc atggttatat 14880 acaactttgg ctatggcatg ttatggcaat agtgatgaaa aaagagattt atctgaattt 14940 tatgcaaagc taggttgtaa taaaaatatg atcaacactg tacttcgatt tgggaaacta 15000 gcatatgcag tgaaaaatat taccgtgctt aagaatttta ctaaacggat aattaagtag 15060 tagtaagtta ttatattgag attaaatgta gatttaacct ttctggattc agctagattt 15120 acgttactga cttttctttt taatgaaaat catatttgat atatataaat aaatttggat 15180 agcttaacta cttagatgtt tttttctggg aatgttagta taataatata tttctttatg 15240 attgtttttg tagtgtttta ctgccggtat tacattaact ctattattaa gaattacacc 15300 tagtgtaagc ttcgtaatat tatttatcct tatgattatt gctttaaaga tgcgtatgga 15360 aaaacggaga gctattcaat gatcgtaaac ctatcacgtt taggtaaaag tggtacggga 15420 atgtggcaat actcgattaa atttttaacg gcactgcgag aaatagctga tgttgacgca 15480 ataatctgta gcaaggtaca cgctgattat tttgaaaagc tcggttatgc agtagttact 15540 gttccgaata ttgttagcaa cacatcaaaa acatcgcgac ttagaccatt agtatggtat 15600 gtatatagtt actggcttgc gctgagggtt ttaattaagt ttggtaataa aaaattggtg 15660 tgtactacac atcacactat ccccttactg agaaaccaaa cgataaccgt acatgatata 15720 agaccttttt attatccaga tagttttatt cagaaagtgt attttcgctt tttattaaaa 15780 atgtccgtta agcgatgtaa gcatgtttta acggtatctt ataccgttaa agatagcatt 15840 gctaaaactt ataatgtaga tagtgagaaa atatcagtaa tttataatag tgttaataaa 15900 tctgatttta tacaaaaaaa agaaaaagag aattactttt tagctgttgg tgcaagttgg 15960 ccacataaaa atattcattc attcataaaa aataaaaaag tttggtctga ctcttataat 16020 ttaattattg tatgtggtcg tactgactat gcaatgtctc tccaacaaat ggtcgttgat 
16080 ctggaactaa aagataaagt gactttttta catgaagtct catttaatga attaaagatt 16140 ttatattcta aagcctacgc gcttgtttat ccatctattg atgagggttt tggtatacct 16200 cctattgaag cgatggcatc aaatactcca gttatagtgt ccgatatacc agtatttcat 16260 gaagtgttaa ccaatggtgc attatatgtg aatccggatg atgaaaaaag ctggcagagt 16320 gcaattaaaa atatagagca gttgcctgat gcaatttccc gatttaacaa ctatgtcgca 16380 cggtatgact ttgataatat gaagcagatg gttggcaatt ggttggcgga atcaaaataa 16440 atgaaaataa cattaattat tcccacatat aatgcagggt cgctttggcc taatgttctg 16500 gatgcgatta agcagcaaac tatatatccg gataaattga ttgttataga ctcaggttct 16560 aaagatgaaa cggttccgtt agcctcagac ctgaaaaata tatcaatatt taatattgac 16620 tctaaagatt ttaatcatgg aggaaccaga aatttagcag ttgcaaaaac tctggacgct 16680 gatgttataa tttttctaac gcaagatgca attctcgcgg attcggatgc aattaaaaat 16740 ttggtttatt atttttcaga tccattgata gcagcggttt gtggtagaca acttcctcat 16800 aaagatgcta atccccttgc agtgcatgcc agaaatttta attatagttc aaaatctatt 16860 gttaaaagta aggcagatat agaaaaattg ggtattaaaa ctgtatttat gtccaattct 16920 tttgctgcct atcgccgttc cgtttttgaa gagttaagtg ggtttcctga acatacaatt 16980 cttgccgagg atatgtttat ggcggctaag atgattcagg cgggttataa ggtcgcctac 17040 tgcgctgaag cggtggtaag acactcccat aattataccc cgcgagaaga gtttcaacga 17100 tattttgata ctggtgtatt tcatgcttgt tctccgtgga ttcagcgtga ctttggcgga 17160 gccggtggtg agggtttccg cttcgtaaaa tcagagattc aattcctgct taaaaatgca 17220 ccgttctgga ttccaagagc tttattaaca acctttgcta aattcttggg ttacaaatta 17280 ggcaagcatt ggcaatcttt accgttgtct acatgtcgct attttagcat gtacaagagt 17340 tattggaata atatccaata ttcttcgtca aaagagataa aataaatgtc ttttcttccc 17400 gtaattatgg ctggcggcac aggtagccgt ttatggccgc tttcacgcga atatcatccg 17460 aagcagtttc taagcgttga aggtaaacta tcaatgctgc aaaatactat aaagcgatta 17520 gcttcacttt ctacagaaga acccgttgtc atttgcaatg acagacaccg tttcttagtc 17580 gctgaacaac tccgtgaaat tgacaagtta gcaaataata ttattctcga accggtaggc 17640 cgtaatactg caccagcgat cgctcttgcc gcgttttgtg cgctccagaa tgctgataat 17700 gctgatcctc ttttgttggt tcttgctgca gatcatgtga 
ttcaggatga aatagctttt 17760 acgaaagctg tcagacatgc tgaagaatac gctgcaaatg gtaagcttgt aacttttggt 17820 attgttccaa cgcatgctga aacgggttat ggatatattc gtcgtggtga gttgatagga 17880 aatgacgctt atgcagtggc tgaatttgtg gagaaaccgg atatcgatac cgccggtgac 17940 tatttcaaat cagggaaata ttactggaat agcggtatgt ttttatttcg tgcaagctct 18000 tatttaaacg aattaaagta tttatcacct gaaatttata aagcttgtga aaaggcggta 18060 ggacatataa atcccgatct tgattttatt cgtattgata aagaagagtt tatgtcatgc 18120 ccgagtgatt ctatcgatta tgcagttatg gagcacacac agcatgcggt ggtgatacca 18180 atgagcgctg gctggtcgga tgtgggttcc tggtcctcac tttgggatat atcgaataaa 18240 gatcatcaga gaaatgtttt aaaaggagat attttcgcac atgcttgtaa tgataattac 18300 atttattccg aagatatgtt tataagtgcg attggtgtaa gcaatcttgt cattgttcaa 18360 acaacagacg ctttactggt ggctaataaa gatacagtac aagatgttaa aaaaattgtc 18420 gattatttaa aacggaatga taggaacgaa tataaacaac atcaagaagt tttccgcccc 18480 tggggaaaat ataatgtgat tgatagcggc aaaaattacc tcgttcgatg tatcactgtt 18540 aagccgggtg agaaatttgt ggcgcagatg catcaccacc gggctgagca ttggatagta 18600 ttatccggga ctgctcgtgt tacaaaggga gagcagactt atatggtttc tgaaaatgaa 18660 tcaacattta ttcctccgaa tactattcac gcgctggaaa atcctggaat gacccccctg 18720 aagttaattg agattcaatc aggtacctat cttggtgagg atgatattat tcgtttagaa 18780 caacgttctg gattttcgaa ggagtggact aatgaacgta gttaataata gccgtgatgt 18840 tatttattca tcaggtattg tgtttggaac gagtggggct cgcggtcttg taaaagattt 18900 tacacctcag gtatgtgctg cttttacggt ttcatttgtt gccgttatgc aggaacattt 18960 ttcctttgat accgtagcat tggcaataga taatcgtcca agtagttatg ggatggctca 19020 ggcgtgtgct gctgcattgg cggataaagg cgttaactgt attttttatg gagtggtacc 19080 aaccccagct ttggcctttc agtctatgtc tgacaatatg cctgcgataa tggttacggg 19140 aagtcatatt ccattcgagc ggaacggcct caagttttat cgtcctgatg gtgaaatcac 19200 gaaacatgat gaggctgcga tccttagtgt tgaagatacg tgcagccatt tagagcttaa 19260 agaactcata gtttcagaaa tggctgctgt taattatata tctcgttata catctttatt 19320 ttctactcca ttcctgaaaa ataagcgtat tggtatttac gaacattcaa gcgctgggcg 19380 tgatctttat aagcctttat 
ttattgcatt gggggctgaa gtcgttagct tgggtagaag 19440 cgataatttt gtacctatag atacagaggc tgtaagcaaa gaggatcggg aaaaagctcg 19500 ctcatgggct aaagagttcg atttagatgc catattctcg acagatgggg atggtgatcg 19560 ccctcttatt gctgatgagg ccggtgagtg gctaagaggc gatatactag gtctattatg 19620 ttcacttgca ttggatgcag aagccgtcgc tattcctgtt agttgtaaca gcataatttc 19680 ttctggccgc ttttttaaac atgttaagct tacaaaaatt ggctcgcctt atgttatcga 19740 agcttttaat gaattatcgc ggagttatag tcgtattgtc ggttttgaag ccaatggcgg 19800 ttttttatta ggaagcgaca tctgtattaa cgagcagaat cttcatgcct taccaactcg 19860 tgatgctgta ttaccagcaa taatgctgct ttacaaaagt aggaatacca gcattagcgc 19920 tttagtcaat gaactcccaa ctcgttacac ccattctgac agattacagg ggattacaac 19980 tgataaaagt caatccttaa ttagtatggg cagagaaaat ctgagcaacc tcttaagcta 20040 tattggtttg gagaatgaag gtgcaatttc tacagatatg acagatggta tgcgaattac 20100 tttacgtgat ggatgtattg tgcatttgcg cgcttctggt aatgcacctg agttacgctg 20160 ctatgcagaa gctaatttat taaatagggc tcaggatctt gtaaatacaa cgcttgctaa 20220 tattaaaaaa cgatgcttgc tgtaaaaaaa ttgaatgtta tttacttaat atgcctattt 20280 tatttacatt atgcacggtc agagggtgag gattaaatgg ataatattga taataagtat 20340 aatccacagc tatgtaaaat ttttttggct atatcggatt tgattttttt taatttagcc 20400 ttatggtttt cattaggatg tgtctatttt atttttgatc aagtacagcg atttattcct 20460 caagaccaat tagatacaag agttattacg cattttattt tgtcagtagt atgtgtcggt 20520 tggttttgga ttcgtttgcg acattatact atccgcaagc cattttggta tgagttaaaa 20580 gaaatttttc gtacgatcgt tatttttgct atatttgatt tggctctgat agcgtttaca 20640 aaatggcagt tttcacgcta tgtctgggtg ttttgttgga cttttgccct aatcctggtg 20700 cctttttttc gcgcacttac aaagcattta ttgaacaagc taggtatctg gaagaaaaaa 20760 actatcatcc tggggagcgg acagaatgct cgtggtgcat attctgcgct gcaaagtgag 20820 gagatgatgg ggtttgatgt tatcgctttt tttgatacgg atgcgtcaga tgctgaaata 20880 aatatgttgc cggtgataaa ggatactgag attatttggg atttaaatcg tacaggtgat 20940 gtccattata tccttgctta tgaatacacc gagttggaga aaacacattt ttggctacgt 21000 gaactttcaa aacatcattg tcgttctgtt actgtagtcc cctcgtttag aggattgcca 21060 
ttatataata ctgatatgtc ttttatcttt agccatgaag ttatgttatt aaggatacaa 21120 aataacttgg ctaaaaggtc gtcccgtttt ctcaaacgga catttgatat tgtttgttca 21180 ataatgattc ttataattgc atcaccactt atgatttatc tgtggtataa agttactcga 21240 gatggtggtc cggctattta tggtcaccag cgagtaggtc ggcatggaaa actttttcca 21300 tgctacaaat ttcgttctat ggttatgaat tctcaagagg tactaaaaga acttttggct 21360 aacgatccta ttgccagggc tgaatgggag aaagatttta aactgaaaaa tgatcctcga 21420 atcacagctg taggtcgatt tatacgtaaa actagccttg atgagttgcc acaacttttt 21480 aatgtactaa aaggtgatat gagcctggtt ggaccacgac ctatcgtttc ggatgaactg 21540 gagcgttatt gtgatgatgt tgattattat ttgatggcaa agccgggcat gacaggtcta 21600 tggcaagtga gtgggcgtaa tgatgttgat tatgacactc gtgtttattt tgattcctgg 21660 tatgttaaaa actggacgct ttggaatgat attgccattc tgtttaaaac agcgaaagtt 21720 gttttgcggc gagatggtgc gtattaagct taccgagaag tactgaataa taattgtata 21780 aattagcctg cgtaaaatct gaacgcatca atcgctacct taatatcata cctttgagtt 21840 aacatactat tcacctttaa cctgccatga ccgtttgtgg cagggtttcc acacctgaca 21900 ggagtatgta atgtccaagc aacagatcgg cgtcgtcggt atggcagtga tggggcgcaa 21960 cctcgcgctc aacatcgaaa gccgtggtta taccgtctcc gttttcaacc gctcccgtga 22020 aaagaccgaa gaagtgattg ccgagaatcc cggcaaaaag ctggtgcctt attacacggt 22080 
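For readers handling the listing in software, the following is a minimal sketch, not part of the specification, of how a listing in the layout used above (blocks of ten nucleotides followed by a running position count) might be joined into one contiguous string, and how a 1-based inclusive nucleotide range of the kind recited in the claims might then be extracted. The function names `parse_listing` and `extract_region` and the toy input are assumptions made purely for illustration; the toy sequence is not taken from any SEQ ID NO.

```python
import re


def parse_listing(text: str) -> str:
    """Join a numbered sequence listing into one lowercase nucleotide string.

    Whitespace and running position counts (pure-digit tokens) are dropped.
    """
    blocks = [tok.lower() for tok in text.split() if not tok.isdigit()]
    seq = "".join(blocks)
    # Reject anything that is not a plain nucleotide symbol (assumed alphabet).
    if re.search(r"[^acgtn]", seq):
        raise ValueError("unexpected character in sequence listing")
    return seq


def extract_region(seq: str, start: int, end: int) -> str:
    """Return the 1-based, inclusive range start..end of seq."""
    if not (1 <= start <= end <= len(seq)):
        raise ValueError("range outside sequence")
    return seq[start - 1:end]


# Hypothetical toy listing: two full blocks of ten, one partial block.
toy = "atggcatcga ttgacctagg 20 ccatgga 27"
seq = parse_listing(toy)
region = extract_region(seq, 5, 12)
```

The range convention (1-based, both ends included) is an assumption that matches the way nucleotide positions are usually counted in a listing; it should be checked against the positions given for each gene before relying on the result.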

The claims:
 1. A nucleic acid molecule derived from: a gene encoding a transferase; or a gene encoding an enzyme for the transport or processing of a polysaccharide or oligosaccharide unit, including a wzx gene or a wzy gene, or a gene with a similar function; the gene being involved in the synthesis of a particular bacterial polysaccharide antigen, wherein the sequence of the nucleic acid molecule is specific to the particular bacterial polysaccharide antigen.
 2. A nucleic acid molecule derived from: a gene encoding a transferase; or a gene encoding an enzyme for the transport or processing of a polysaccharide or oligosaccharide unit such as a wzx or wzy gene; the gene being involved in the synthesis of a particular bacterial O antigen, wherein the sequence of the nucleic acid molecule is specific to the particular bacterial O antigen.
 3. A nucleic acid molecule derived from: a gene encoding a transferase; or a gene encoding an enzyme for the transport or processing of a polysaccharide or oligosaccharide unit such as a wzx or wzy gene; the gene being involved in the synthesis of an O antigen expressed by E. coli, wherein the sequence of the nucleic acid molecule is specific to the O antigen.
 4. A nucleic acid molecule derived from: a gene encoding a transferase; or a gene encoding an enzyme for the transport or processing of a polysaccharide or oligosaccharide unit such as a wzx or wzy gene; the gene being involved in the synthesis of an O antigen expressed by S. enterica, wherein the sequence of the nucleic acid molecule is specific to the O antigen.
 5. A nucleic acid molecule according to any one of claims 1 to 4 wherein the nucleic acid molecule is approximately 10 to 20 nucleotides in length.
 6. A nucleic acid molecule derived from a gene, the gene being selected from a group consisting of the following sequences: nucleotide position 739 to 1932 of SEQ ID NO:1; nucleotide position 8646 to 9911 of SEQ ID NO:1; nucleotide position 9901 to 10953 of SEQ ID NO:1; nucleotide position 11821 to 12945 of SEQ ID NO:1; nucleotide position 79 to 861 of SEQ ID NO:2; nucleotide position 858 to 2042 of SEQ ID NO:2; nucleotide position 2011 to 2757 of SEQ ID NO:2; nucleotide position 2744 to 4135 of SEQ ID NO:2; nucleotide position 5257 to 6471 of SEQ ID NO:2; and nucleotide position 13156 to 13821 of SEQ ID NO:2; which nucleic acid molecule is capable of hybridizing to complementary sequence from said gene.
 7. A nucleic acid molecule which is any one of the oligonucleotides in Table 5 or 5A, with respect to the genes wbdH, wzx, wzy and wbdM.
 8. A nucleic acid molecule which is any one of the oligonucleotides in Table 6 or 6A.
 9. A nucleic acid molecule derived from a gene, the gene being selected from a group consisting of the following sequences: nucleotide position 1019 to 2359 of SEQ ID NO:3; nucleotide position 2352 to 3314 of SEQ ID NO:3; nucleotide position 3361 to 3875 of SEQ ID NO:3; nucleotide position 3977 to 5020 of SEQ ID NO:3; nucleotide position 5114 to 6313 of SEQ ID NO:3; nucleotide position 6313 to 7323 of SEQ ID NO:3; nucleotide position 7310 to 8467 of SEQ ID NO:3; nucleotide position 12762 to 14054 of SEQ ID NO:4; and nucleotide position 14059 to 15060 of SEQ ID NO:4; which nucleic acid molecule is capable of hybridizing to complementary sequences from said gene.
 10. A nucleic acid molecule which is any one of the oligonucleotides in Table 7.
 11. A nucleic acid molecule which is any one of the oligonucleotides in Table 8 with respect to the genes wzx and wbaV.
 12. A method of testing a sample for the presence of one or more bacterial polysaccharide antigens, the method comprising the following steps: (a) contacting the sample with at least one oligonucleotide molecule capable of specifically hybridising to: (i) a gene encoding a transferase, or (ii) a gene encoding an enzyme for transport or processing of oligosaccharide or polysaccharide units, including a wzx or wzy gene; wherein said gene is involved in the synthesis of the bacterial polysaccharide antigen; under conditions suitable to permit the at least one oligonucleotide molecule to specifically hybridise to at least one such gene of any bacteria expressing the bacterial polysaccharide antigen present in the sample and (b) detecting any specifically hybridised oligonucleotide molecules.
 13. The method according to claim 12, the method further comprising contacting the sample with a further at least one oligonucleotide molecule capable of specifically hybridising to at least one sugar pathway gene under conditions suitable to permit the further at least one oligonucleotide molecule to specifically hybridise to at least one such sugar pathway gene of any bacteria expressing the bacterial polysaccharide antigen present in the sample and detecting any specifically hybridised oligonucleotide molecules.
 14. A method of testing a sample for the presence of one or more bacterial polysaccharide antigens, the method comprising the following steps: (a) contacting the sample with at least one pair of oligonucleotide molecules, with at least one oligonucleotide molecule of the pair capable of specifically hybridising to: (i) a gene encoding a transferase, or (ii) a gene encoding an enzyme for transport or processing of oligosaccharide or polysaccharide units, including a wzx or wzy gene; wherein the gene is involved in the synthesis of the bacterial polysaccharide antigen; under conditions suitable to permit the at least one oligonucleotide molecule of the pair of molecules to specifically hybridise to at least one such gene of any bacteria expressing the bacterial polysaccharide antigen present in the sample and (b) detecting any specifically hybridised oligonucleotide molecules.
 15. The method according to claim 14, the method further comprising contacting the sample with a further at least one pair of oligonucleotide molecules, with at least one oligonucleotide molecule of the pair capable of specifically hybridising to at least one sugar pathway gene under conditions suitable to permit the further at least one oligonucleotide molecule of the pair to specifically hybridise to at least one such sugar pathway gene of any bacteria expressing the bacterial polysaccharide antigen present in the sample and detecting any specifically hybridised oligonucleotide molecules.
 16. A method of testing a sample for the presence of one or more bacterial O antigens, the method comprising the following steps: (a) contacting the sample with at least one oligonucleotide molecule capable of specifically hybridising to: (i) a gene encoding an O antigen transferase, or (ii) a gene encoding an enzyme for transport or processing of the oligosaccharide or polysaccharide units, including a wzx or wzy gene; wherein said gene is involved in the synthesis of the bacterial O antigen; under conditions suitable to permit the at least one oligonucleotide molecule to specifically hybridise to at least one such gene of any bacteria expressing the bacterial O antigen present in the sample and (b) detecting any specifically hybridised oligonucleotide molecules.
 17. The method according to claim 16, the method further comprising contacting the sample with a further at least one oligonucleotide molecule capable of specifically hybridising to at least one sugar pathway gene under conditions suitable to permit the further at least one oligonucleotide molecule to specifically hybridise to at least one such sugar pathway gene of any bacteria expressing the bacterial O antigen present in the sample and detecting any specifically hybridised oligonucleotide molecules.
 18. The method according to claim 16 or 17 wherein the O antigen is expressed by E. coli or S. enterica.
 19. The method according to claim 18 wherein the E. coli express the O157 O antigen serotype or the O111 O antigen serotype.
 20. The method according to claim 18 wherein the S. enterica express the C2 or B O antigen serotype.
 21. The method according to any one of claims 16 to 20 wherein the specifically hybridised oligonucleotide molecules are detected by Southern blot analysis.
 22. A method of testing a sample for the presence of one or more bacterial O antigens, the method comprising the following steps: (a) contacting the sample with at least one pair of oligonucleotide molecules, with at least one oligonucleotide molecule of the pair being capable of specifically hybridising to: (i) a gene encoding an O antigen transferase, or (ii) a gene encoding an enzyme for transport or processing of oligosaccharide or polysaccharide units, including a wzx or wzy gene; wherein the gene is involved in the synthesis of the bacterial O antigen; under conditions suitable to permit the at least one oligonucleotide molecule of the pair of molecules to specifically hybridise to at least one such gene of any bacteria expressing the bacterial O antigen present in the sample and (b) detecting any specifically hybridised oligonucleotide molecules.
 23. The method according to claim 22, the method further comprising contacting the sample with a further at least one pair of oligonucleotide molecules, with at least one oligonucleotide molecule of the pair capable of specifically hybridising to at least one sugar pathway gene under conditions suitable to permit the further at least one oligonucleotide molecule of the pair to specifically hybridise to at least one such sugar pathway gene of any bacteria expressing the bacterial O antigen present in the sample and detecting any specifically hybridised oligonucleotide molecules.
 24. The method according to claim 22 or 23 wherein the O antigen is expressed by E. coli or S. enterica.
 25. The method according to claim 24 wherein the E. coli are of the O111 or the O157 O antigen serotype.
 26. The method according to claim 24 wherein the S. enterica express the C2 or B O antigen serotype.
 27. The method according to any one of claims 22 to 26 wherein the method is performed according to the polymerase chain reaction method.
 28. The method according to any one of claims 22 to 26 wherein the oligonucleotide molecules are selected from the group of nucleic acid molecules according to any one of claims 5 to 11.
 29. A method for testing a food derived sample for the presence of one or more particular bacterial O antigens, the method being according to any one of claims 16 to 28.
 30. A method for testing a faecal derived sample for the presence of one or more particular bacterial O antigens, the method being according to any one of claims 16 to 28.
 31. A method for testing a sample derived from a patient for the presence of one or more particular bacterial O antigens, the method being according to any one of claims 16 to 28.
 32. A kit comprising a first vial containing a first nucleic acid molecule capable of specifically hybridising to: (i) a gene encoding a transferase, or (ii) a gene encoding an enzyme for transport or processing oligosaccharide or polysaccharide units, including a wzx or wzy gene, wherein said gene is involved in the synthesis of a bacterial polysaccharide.
 33. The kit according to claim 32 further comprising in the first vial, or in a second vial, a second nucleic acid molecule capable of specifically hybridising to: (i) a gene encoding a transferase, or (ii) a gene encoding an enzyme for transport or processing oligosaccharide or polysaccharide units, including a wzx or wzy gene, wherein said gene is involved in the synthesis of a bacterial polysaccharide, and wherein the sequence of the second nucleic acid molecule is different from the sequence of the first nucleic acid molecule.
 34. The kit according to claim 33 further comprising a nucleic acid molecule derived from a sugar pathway gene.
 35. A kit according to claim 32 further comprising in the first vial, or in a second vial, a second nucleic acid molecule capable of specifically hybridising to a sugar pathway gene.
 36. A kit according to any one of claims 32 to 35 wherein the nucleic acid molecules are approximately 10 to 20 nucleotides in length.
 37. A kit comprising a first vial containing a first nucleic acid molecule capable of specifically hybridising to: (i) a gene encoding a transferase, or (ii) a gene encoding an enzyme for transport or processing oligosaccharide or polysaccharide units, including a wzx or wzy gene, wherein said gene is involved in the synthesis of a bacterial O antigen.
 38. The kit according to claim 37, further comprising in the first vial, or in a second vial, a second nucleic acid molecule capable of specifically hybridising to: (i) a gene encoding a transferase, or (ii) a gene encoding an enzyme for transport or processing oligosaccharide or polysaccharide units, including a wzx or wzy gene, wherein said gene is involved in the synthesis of a bacterial O antigen, and wherein the sequence of the second nucleic acid molecule is different from the sequence of the first nucleic acid molecule.
 39. A kit according to claim 37 further comprising in the first vial, or in a second vial, a second nucleic acid molecule capable of specifically hybridising to a sugar pathway gene.
 40. The kit according to claim 38 further comprising a nucleic acid molecule derived from a sugar pathway gene.
 41. The kit according to any one of claims 37 to 40 wherein the nucleic acid molecules are approximately 10 to 20 nucleotides in length.
 42. The kit according to any one of claims 31 to 34 wherein the first and second nucleic acid molecules are according to any one of claims 5 to 11.
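As an illustration only, and not as part of the claims, the paired-oligonucleotide methods recited above can be sketched in software as a check of whether a primer pair would bracket a product on a given template. This is a minimal sketch under strong assumptions: exact string matching stands in for specific hybridisation, which in practice depends on reaction conditions, and the names `reverse_complement`, `find_amplicon` and `is_claimed_length`, as well as the toy template and primers, are hypothetical. The length check treats the "approximately 10 to 20 nucleotides" of the claims as the closed range 10 to 20, which is a simplification.

```python
COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a nucleotide string."""
    return seq.lower().translate(COMPLEMENT)[::-1]


def find_amplicon(template: str, forward: str, reverse: str):
    """Return (start, end), 1-based inclusive, of the predicted product.

    The forward primer must match the given strand exactly; the reverse
    primer must match the opposite strand downstream of it. Returns None
    if either primer has no exact site.
    """
    template = template.lower()
    fwd = forward.lower()
    rev_site = reverse_complement(reverse)
    start = template.find(fwd)
    if start == -1:
        return None
    site = template.find(rev_site, start + len(fwd))
    if site == -1:
        return None
    return (start + 1, site + len(rev_site))


def is_claimed_length(oligo: str, low: int = 10, high: int = 20) -> bool:
    """Check an oligonucleotide against an assumed 10 to 20 nt window."""
    return low <= len(oligo) <= high


# Hypothetical toy data, not taken from any table or SEQ ID NO.
template = "ttgacatggcatcgattgacctaggccatggaattccgtacgatcgtt"
forward = "atggcatcgatt"
reverse = "acggaattccat"
product = find_amplicon(template, forward, reverse)
```

A real specificity assessment would also need to test each primer against the corresponding genes of other serotypes, since the claims turn on the sequence being specific to one antigen; that comparison is outside the scope of this sketch.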